Recombinant lectin and uses thereof

ABSTRACT

Disclosed herein are a recombinant Streptomyces S27S5 hemagglutinin (SHA), and homologues thereof, and a fusion protein of a fluorescent protein (such as GFP and mCherry1) and SHA or a homologue thereof, which specifically bind to carbohydrates, including oligomeric sugars that terminate in L-rhamnose or D-galactose. The SHA, SHA homologues, and fusion proteins can be used to detect a variety of microorganisms or cancer or tumor antigens.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present invention claims priority to U.S. Provisional Application Nos. 62/574,626 and 62/574,636, both filed on Oct. 19, 2017, the contents of which are incorporated herein by reference in their entireties, including the drawings.

STATEMENT OF GOVERNMENT INTEREST

The present invention was made with government support under Grant No. P30 CA33572, awarded by the National Institutes of Health (NIH). The Government has certain rights in the invention.

BACKGROUND

In 1972, culture supernatants of 333 Actinomycetales bacterial strains isolated from the greater Tokyo area were screened for hemagglutination activity to identify microbial lectins (1,2). The culture supernatant of Streptomyces sp. 27S5 exhibited blood type B-specific activity. This was unusual at the time, because previously known plant lectins were either A- or O-blood type-specific. Sixty mg of the identified lectin, named Streptomyces hemagglutinin (SHA), was purified to homogeneity from a 15-L culture broth of S. sp. 27S5 using gum arabic affinity chromatography, achieving a 13,300-fold enrichment with 64% recovery of the total activity (3). More than 200 mg of SHA was ultimately purified and subjected to various analyses. SHA was characterized as a small protein (~11 kDa) with unique characteristics, such as rare blood type B specificity, an atypical tryptophan-rich nature, and two carbohydrate-binding sites (3,4). Accordingly, further study of SHA is needed.

SUMMARY

This disclosure relates to the characterization and production of a recombinant SHA and homologues thereof, and of a fusion protein of a fluorescent protein and SHA, that specifically bind to L-rhamnose or D-galactose, and to novel uses of SHA and the fusion protein in detecting microorganisms that express carbohydrates containing L-rhamnose or D-galactose on the surface, or in detecting tumor-expressed carbohydrates capable of specifically binding to SHA. Certain cancer/tumor cells expressing carbohydrates containing D-galactose on the surface can be detected by the methods disclosed herein. In some embodiments, the fusion protein is a fusion protein of a green fluorescent protein and SHA (GFP-SHA). In some embodiments, the fusion protein is a fusion protein of a red fluorescent protein and SHA (mCherry1-SHA). In some embodiments, the fusion protein is a non-aggregating protein. In some embodiments, the fusion protein is a soluble protein that is stable at about 4° C. for an extended period of time. In some embodiments, the fluorescent protein is linked to the N-terminus of SHA. In some embodiments, the fluorescent protein is linked to the C-terminus of SHA. In some embodiments, the fluorescent protein and SHA are linked via an acidic linker.

In one aspect, the disclosure provided herein relates to a recombinant Streptomyces S27S5 hemagglutinin (SHA), homologues thereof, and fusion proteins of a fluorescent protein (such as GFP or mCherry1) and SHA or homologues thereof (GFP-SHA fusion proteins or mCherry1-SHA fusion proteins). SHA, homologues thereof, SHA or homologues labeled with a marker such as a fluorescein or a derivative thereof, and GFP-SHA or mCherry1-SHA fusion proteins specifically bind to carbohydrates, including oligomeric sugars that terminate in L-rhamnose or D-galactose.

In another aspect, the disclosure provided herein relates to a method for detecting a microbial infection in a subject, wherein the microbial cell expresses a carbohydrate containing L-rhamnose or D-galactose on the surface. The method includes contacting a fluorescein or a derivative thereof labeled SHA or a fusion protein of a fluorescent protein and SHA disclosed herein with a sample from the subject, and detecting the fluorescence level in the sample, wherein the detection of fluorescence in the sample indicates the presence of the microbial infection. In some embodiments, the fusion protein is a fusion protein of a green fluorescent protein and SHA (GFP-SHA). In some embodiments, the fusion protein is a fusion protein of a red fluorescent protein and SHA (mCherry1-SHA). In some embodiments, the SHA is labeled with a fluorescein derivative such as fluorescein isothiocyanate (FITC). In some embodiments, the sample includes a biopsy sample, a tissue sample, a bronchoalveolar lavage sample, a blood sample, and a urine sample. In some embodiments, the microbial infection is caused by a bacterium or a fungus that expresses the dTDP-4-dehydrorhamnose reductase gene (rmlD). In some embodiments, the microbial infection includes mycoses caused by Candida albicans, Aspergillus fumigatus, or Fusarium solani. In some embodiments, the microbial infection is an infection by Streptococcus, Enterococcus or Lactococcus. In some embodiments, the microbial infection is invasive pulmonary aspergillosis, and the GFP-SHA fusion protein disclosed herein detects the presence of fungal galactomannan, indicating invasive pulmonary aspergillosis.

In another aspect, the disclosure provided herein relates to a method for detecting a cancer or tumor in a subject, wherein the cancer or tumor cell expresses a carbohydrate capable of specifically binding to SHA, a homologue thereof, a fragment of the SHA or a homologue thereof, a fluorescein or a derivative thereof labeled SHA, a homologue or fragment of the SHA, or a fusion protein of a fluorescent protein and SHA disclosed herein. In some embodiments, the cancer or tumor cell expresses a surface antigen containing D-galactose. In some embodiments, the method includes contacting a fluorescein or a derivative thereof labeled SHA, or a fusion protein of a fluorescent protein and SHA disclosed herein with a sample from the subject, and detecting the fluorescence level in the sample, wherein the detection of fluorescence in the sample indicates the presence of the cancer or tumor cell. In some embodiments, the sample includes a biopsy sample, a tissue sample, a bronchoalveolar lavage sample, a blood sample, and a urine sample. In some embodiments, the cancer includes colon cancer, pancreatic ductal carcinoma and pancreatic cancer. In some embodiments, the fusion protein is a fusion protein of a green fluorescent protein and SHA (GFP-SHA). In some embodiments, the fusion protein is a fusion protein of a red fluorescent protein and SHA (mCherry1-SHA). In some embodiments, the SHA is labeled with a fluorescein derivative such as fluorescein isothiocyanate (FITC).

In another aspect, the disclosure provided herein relates to a positron emission tomography (PET) probe comprising an SHA protein, a homologue thereof, a functional fragment of SHA or a homologue thereof, a fluorescein or a derivative thereof labeled SHA or a homologue or fragment of the SHA, or a fusion protein of a fluorescent protein and SHA disclosed herein labeled with a positron-emitting isotope. In some embodiments, the fusion protein is a fusion protein of a green fluorescent protein and SHA (GFP-SHA). In some embodiments, the fusion protein is a fusion protein of a red fluorescent protein and SHA (mCherry1-SHA). In some embodiments, the SHA is labeled with a fluorescein derivative such as fluorescein isothiocyanate (FITC).

In a related aspect, the disclosure relates to a method of imaging an organ or tissue having a microbial infection or detecting a location having a microbial infection caused by a microorganism expressing a carbohydrate containing L-rhamnose or D-galactose on the surface. The method entails administering to a subject suffering from or suspected of suffering from a microbial infection the PET probe described above, and imaging the organ or the tissue having the microbial infection by a PET scanning of the subject. Alternatively, the method entails administering to a subject suffering from or suspected of suffering from a microbial infection the PET probe described above, and detecting the location of the PET probe, thereby determining the location of the microbial infection. In some embodiments, the PET probe is locally administered to the subject. In some embodiments, the PET probe is systemically administered to the subject, e.g., by intravenous injection.

In yet another related aspect, the disclosure relates to a method of imaging a tumor or detecting a location having a cancer or tumor, where the cancer or tumor cell expresses an antigen comprising a carbohydrate capable of specifically binding to SHA, a homologue thereof, a fragment of the SHA and a homologue thereof, or a fluorescein labeled SHA or a homologue or fragment of the SHA. The method entails administering to a subject suffering from or suspected of suffering from a cancer or tumor the PET probe described above, and imaging the organ or the tissue having the cancer or tumor by a PET scanning of the subject. Alternatively, the method entails administering to a subject suffering from or suspected of suffering from a cancer or tumor the PET probe described above, and detecting the location of the PET probe, thereby determining the location of the cancer or tumor cells. In some embodiments, the PET probe is locally administered to the subject. In some embodiments, the PET probe is systemically administered to the subject, e.g., by intravenous injection.

BRIEF DESCRIPTION OF THE DRAWINGS

This application contains at least one drawing executed in color. Copies of this application with color drawing(s) will be provided by the Office upon request and payment of the necessary fees.

FIG. 1 illustrates the primer design for construction of rSHA.

FIG. 2 demonstrates SDS-PAGE analysis of archived SHA and thioredoxin (Trx) fused-SHA. S. sp. 27S5-produced SHA and a recombinant SHA homologue, Trx-SHA, were separated by SDS-PAGE on a 5-12% gradient gel and visualized with Coomassie Blue stain. Archived SHA (lane 1) was applied to a gum arabic gel column from which SHA was eluted by 1 M D-galactose (lane 2) or 0.2 M L-rhamnose (lane 3) in the presence of 1 M NaCl. Recombinant SHA was expressed in E. coli as a Trx-fusion protein and purified (lane 4). Gel image was assembled from three sections of the same gel; omitted spaces are indicated by the vertical lines.

FIG. 3 demonstrates the determination of the molecular mass of SHA. Electrospray Ionization (ESI) Fourier Transform Ion Cyclotron Resonance (FTICR)-MS of archived, forty-year-old SHA revealed an average molecular mass of 13,314.67 Da, a monoisotopic mass of 13,306.65 Da, and the presence of a covalently attached hexose in ~25% of the SHA molecules. The isotope distributions of the molecular SHA ions with a charge state of z=8 are magnified below the original spectrum.

FIG. 4 demonstrates LC MS/MS data of SHA proteolysis products aligned with the deduced amino acid sequence from the homologous partial sequence of the putative polysaccharide deacetylase of S. lavendulae (PDSL, WP_051840348.1). Overlapping SHA peptides were generated by separate enzymatic digestions and analyzed by high resolution Orbitrap LC-MS. Blue lines indicate database matches, grey lines mark matches through de novo sequencing with PEAKS software, and red-boxed O indicates methionine oxidation.

FIGS. 5A-5D show the primary structure of SHA. FIGS. 5A and 5B show that a single amino acid difference between recombinant SHA and the putative SHA domain of PDSL was identified by MALDI-MS of peptides from SHA and Trx-SHA, obtained after digestion with trypsin and LysC (5A) or ArgC (5B). FIG. 5C shows Orbitrap LC-MS of two ArgC-digested peptides of SHA before (non-reduced) and after (reduced) reduction with TCEP. FIG. 5D shows the sequence and primary structure of SHA derived from the results in FIGS. 5A-5C. Fragments identified by MS analyses (5A, 5B) are indicated by blue lines; S-S bonds identified by Orbitrap LC-MS (comparisons within 5C) are indicated by red linking lines; the A to E mutation at SHA position 108 (indicated by green labeled peak in 5A, and by blue linking lines in 5B) is highlighted in red.

FIGS. 6A-6C show glycan microarray analysis of archived and recombinant SHA. FIG. 6A shows representative heat maps of glycan-specific fluorescent signals (raw data in FIG. 7) in the absence (left panel) or in the presence of 0.2 M L-rhamnose (right panel) using two different concentrations (1×, 0.1×) of SHA (top) or rSHA (bottom).

FIG. 6B shows quantification of the normalized fluorescent glycan binding signals for archived and recombinant SHA, as in FIG. 6A (n=4). POS1-3 are positive controls containing standardized amounts of biotinylated IgGs; NEG are negative controls; numbers mark array positions of glycans with positive binding signals, as listed in FIG. 6B. Linker molecules are SP: OCH₂CH₂CH₂NH₂ and SP1: NH(CH₃)OCH₂CH₂NH₂. All other glycans that did not bind SHA/rSHA are listed in Table 4. FIG. 6C shows SDS-PAGE analysis of purified rSHA and archived SHA on a 4-12% gradient gel and visualized with Coomassie Blue staining.

FIG. 7 shows glycan microarray, original readouts from fluorescence scanner. Raw data of glycan-specific fluorescent signals in the absence (left panel) or presence of 0.2 M L-rhamnose (right panel) using two different concentrations (1×, 0.1×) of SHA (top) or rSHA (bottom). Yellow numbers indicate array positions with positive signals (see FIG. 6B for glycan identity).

FIGS. 8A-8C show SHA and homologues. FIG. 8A shows cross-species amino acid sequence comparison of SHA and the 11 closest SHA homologues. The boxes indicate SHA domains. Solid lines indicate experimentally defined disulfide bonds in SHA. FIG. 8B shows phylogenetic tree of SHA homologues (left). For percent sequence identity for proteins (center), number of identical residues/number of query residues matched are indicated in parentheses. Matching query length >131 indicates additional residues within the SHA homologues. For DNA (right), S. lavendulae DNA (438 bases) was used as the reference query for SHA homologues. Data are shown as percent identity/percent query covered of the corresponding nucleotide sequence. FIG. 8C shows comparison of the three SHA domains in the SHA protein.

FIG. 9 shows ¹H-NMR analysis for L-rhamnose binding to archived SHA. L-rhamnose was added to the SHA solution from 1 to 5 equivalents. The indole NH region (left) and methyl region (right) of 1D ¹H-NMR spectra are shown. Dashed red lines are included for alignment. Chemical shift changes observed for specific peaks indicate binding of L-rhamnose to SHA.

FIG. 10 shows representative fluorescent micrographs of Lactobacillus casei (Shirota) cells stained with recombinant GFP-SHA fusion protein (top left panel). Recombinant GFP was used as a negative control (top right panel). DAPI counterstaining is shown in blue in the merged micrographs (bottom).

FIG. 11 demonstrates ⁶⁸Ga PET imaging of CD-1 mice. These are the areas that contain microorganisms. While the control mouse only shows radioactivity uptake in the kidneys and in the bladder, which is typical for small proteins, the ⁶⁸Ga-DOTA-GFP-SHA-injected mouse reveals additional strong signals from the cecum and the small intestine. The latter are known internal organs that naturally contain a rhamnose-rich microbial flora.

FIG. 12 shows GFP-SHA staining of LS180 cell surfaces. Cells were stained with either 10 μg/mL GFP-SHA or a GFP-only control as indicated, followed by visualization using a Zeiss LSM 880 confocal microscope. DAPI was used as a counterstain to label nuclear DNA. GF=GFP mode; DAP=DAPI mode; Merge=GF+DP overlay.

FIG. 13 shows hematoxylin and eosin (H&E) and fluorescent staining of consecutive slices of cancerous and normal tissues from a City of Hope patient. 40× magnification.

DETAILED DESCRIPTION

The following description of the invention is merely intended to illustrate various embodiments of the invention. As such, the specific modifications discussed are not to be construed as limitations on the scope of the invention. It will be apparent to one skilled in the art that various equivalents, changes, and modifications may be made without departing from the scope of the invention, and it is understood that such equivalent embodiments are to be included herein.

In one aspect, disclosed herein is a fusion protein, the amino acid sequence of which comprises the amino acid sequence of a fluorescent protein and the amino acid sequence of SHA or an SHA homologue, and which fusion protein specifically binds to L-rhamnose or D-galactose. SHA or the SHA homologue specifically binds to L-rhamnose or D-galactose, and fusion to a fluorescent protein allows fluorescent detection of the SHA protein while retaining the specific binding to L-rhamnose or D-galactose. In some embodiments, the fluorescent protein includes GFP and mCherry1.

In some embodiments, SHA or the homologue thereof is a recombinant protein. In some embodiments, the amino acid sequence of SHA comprises three domains represented by SEQ ID NO: 17, SEQ ID NO: 19, and SEQ ID NO: 21, respectively. In some embodiments, each of the three domains comprises SEQ ID NO: 24 at the C-terminus. In some embodiments, the recombinant SHA (rSHA) has the following amino acid sequence: ARTVCYAAHVEGIGWQGAVCDGAVAGTTGQSRRMEAAVIATSGTGGVCANAHLADIGWQGWACAADGKAVTVGTTGQSRRMEALGLQVGNGSVAAQAHVADYGWLNAEGGNPVYVGTTGQSRRMEAVRIWV (SEQ ID NO: 25). In some embodiments, the amino acid sequence of the SHA homologue is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identical to SEQ ID NO: 25. In some embodiments, the amino acid sequence of the SHA homologue is codon optimized.
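By way of illustration only, the percent identity thresholds recited above can be evaluated with a short calculation. The following is a minimal sketch, assuming an ungapped, position-by-position comparison of two sequences of equal length; a comparison against a real homologue would ordinarily rely on a sequence alignment tool. The function names and the single-substitution test variant are hypothetical and are not part of the disclosure.

```python
# SEQ ID NO: 25 as recited above (131 residues).
SEQ_ID_NO_25 = (
    "ARTVCYAAHVEGIGWQGAVCDGAVAGTTGQSRRMEAAVIATSGTGGVCANAHLADIGWQGWACAADGKAV"
    "TVGTTGQSRRMEALGLQVGNGSVAAQAHVADYGWLNAEGGNPVYVGTTGQSRRMEAVRIWV"
)

# Identity thresholds (in percent) recited in the paragraph above.
THRESHOLDS = (70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(reference: str, candidate: str) -> float:
    """Ungapped percent identity; assumes both sequences have equal length."""
    if len(reference) != len(candidate):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(a == b for a, b in zip(reference, candidate))
    return 100.0 * matches / len(reference)


def highest_threshold_met(identity: float):
    """Return the largest recited threshold the identity satisfies, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Hypothetical test variant: residue 108 (E in SEQ ID NO: 25) replaced by A.
variant = SEQ_ID_NO_25[:107] + "A" + SEQ_ID_NO_25[108:]

identity = percent_identity(SEQ_ID_NO_25, variant)
print(f"{identity:.2f}% identical; meets the {highest_threshold_met(identity)}% threshold")
```

Under these assumptions, a single substitution in the 131-residue sequence leaves 130 of 131 positions identical, which still satisfies the 99% threshold.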

In some embodiments, the fusion proteins encompassed in this disclosure include fusion proteins comprising GFP and a functional fragment of SEQ ID NO: 25 or a homologue thereof, as long as the functional fragment and the fusion protein of the GFP-SHA functional fragment are able to specifically bind to L-rhamnose and/or D-galactose. For example, a functional fragment of SEQ ID NO: 25 is a peptide homologous to a consecutive sequence of SEQ ID NO: 25 having substantially the same or even improved binding affinity to L-rhamnose and/or D-galactose compared to the full length SHA protein represented by SEQ ID NO: 25. In some embodiments, the functional fragment is at least 20 amino acids, at least 30 amino acids, at least 40 amino acids, at least 50 amino acids, at least 60 amino acids, at least 70 amino acids, at least 80 amino acids, at least 90 amino acids, at least 100 amino acids, at least 110 amino acids, or at least 120 amino acids in length.

In some embodiments, the fusion proteins encompassed in this disclosure include fusion proteins comprising mCherry1 and a functional fragment of SEQ ID NO: 25 or a homologue thereof, as long as the functional fragment and the fusion protein of the mCherry1-SHA functional fragment are able to specifically bind to L-rhamnose and/or D-galactose. For example, a functional fragment of SEQ ID NO: 25 is a peptide homologous to a consecutive sequence of SEQ ID NO: 25 having substantially the same or even improved binding affinity to L-rhamnose and/or D-galactose compared to the full length SHA protein represented by SEQ ID NO: 25. In some embodiments, the functional fragment is at least 20 amino acids, at least 30 amino acids, at least 40 amino acids, at least 50 amino acids, at least 60 amino acids, at least 70 amino acids, at least 80 amino acids, at least 90 amino acids, at least 100 amino acids, at least 110 amino acids, or at least 120 amino acids in length.

In some embodiments, mCherry1 and SHA are fused via an acidic linker with the sequences shown below. In some embodiments, the acidic linker increases solubility of the fusion proteins.

mCherry1-acidic-linker-SHA DNA sequence (SEQ ID NO: 26) (bases 1-708: mCherry1 (shown in capital letters); bases 709-735: acidic linker (shown in small letters, underlined); bases 736-1128: SHA (shown in capital letters, italic); bases 1129-1134: XhoI cloning site (shown in small letters, italic, underlined); bases 1135-1152: hexahistidine tag (shown in small letters); and bases 1153-1155: stop codon (shown in capital letters, bold)):

¹ATGGTGAGCAAGGGCGAGGAGGATAACATGGCCATCATCAAGGAGTTCATGCGCTTCAAGGTGCACATGGAGGGCTCCGTGAACGGCCACGAGTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGGCACCCAGACCGCCAAGCTGAAGGTGACCAAGGGTGGCCCCCTGCCCTTCGCCTGGGACATCCTGTCCCCTCAGTTCATGTACGGCTCCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCGACTACTTGAAGCTGTCCTTCCCCGAGGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGCGGCGTGGTGACCGTGACCCAGGACTCCTCCCTGCAGGACGGCGAGTTCATCTACAAGGTGAAGCTGCGCGGCACCAACTTCCCCTCCGACGGCCCCGTAATGCAGAAGAAGACTATGGGCTGGGAGGCCTCCTCCGAGCGGATGTACCCCGAGGACGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAAGCTGAAGGACGGCGGCCACTACGACGCTGAGGTCAAGACCACCTACAAGGCCAAGAAGCCCGTGCAGCTGCCCGGCGCCTACAACGTCAACATCAAGTTGGACATCACCTCCCACAACGAGGACTACACCATCGTGGAACAGTACGAACGCGCCGAGGGCCGCCACTCCACCGGCGGCATGGACGAG CTGTACAAG⁷⁰⁹ggtgacgaagtcgacgaagacgaaggt ⁷³⁶ GCGCGTACCGTTTGCTACGCGGCGCACGTTGAAGGTATCGGTTGGCAGGGTGCGGTTTGCGACGGTGCGGTTGCGGGTACCACCGGTCAGTCTCGTCGTATGGAAGCGGCGGTTATCGCGACCTCTGGTACCGGTGGTGTTTGCGCGAACGCGCACCTGGCGGACATCGGTTGGCAGGGTTGGGCGTGCGCGGCGGACGGTAAAGCGGTTACCGTTGGTACCACCGGTCAGTCTCGTCGTATGGAAGCGCTGGGTCTGCAAGTTGGTAACGGTTCTGTTGCGGCGCAGGCGCACGTTGCGGACTACGGTTGGCTGAACGCGGAAGGTGGCAACCCGGTTTACGTTGGCACTACTGGTCAGTCCCGTCGTATGGAAGCGGTTCGTATCTGGGTT ¹¹²⁹ ctcggg¹¹³⁵caccaccaccaccaccac¹¹⁵³ TGA

mCherry1-acidic-linker-SHA protein sequence (SEQ ID NO: 27) (AA 1-236: mCherry1; AA 237-245: acidic linker (italic, underlined); AA 246-378: SHA plus Leu-Glu from cloning site; and AA 379-384: hexahistidine tag):

¹Met-Val-Ser-Lys-Gly-Glu-Glu-Asp-Asn-Met-Ala-Ile-Ile-Lys-Glu-Phe-Met-Arg-Phe-Lys-Val-His-Met-Glu-Gly-Ser-Val-Asn-Gly-His-Glu-Phe-Glu-Ile-Glu-Gly-Glu-Gly-Glu-Gly-Arg-Pro-Tyr-Glu-Gly-Thr-Gln-Thr-Ala-Lys-Leu-Lys-Val-Thr-Lys-Gly-Gly-Pro-Leu-Pro-Phe-Ala-Trp-Asp-Ile-Leu-Ser-Pro-Gln-Phe-Met-Tyr-Gly-Ser-Lys-Ala-Tyr-Val-Lys-His-Pro-Ala-Asp-Ile-Pro-Asp-Tyr-Leu-Lys-Leu-Ser-Phe-Pro-Glu-Gly-Phe-Lys-Trp-Glu-Arg-Val-Met-Asn-Phe-Glu-Asp-Gly-Gly-Val-Val-Thr-Val-Thr-Gln-Asp-Ser-Ser-Leu-Gln-Asp-Gly-Glu-Phe-Ile-Tyr-Lys-Val-Lys-Leu-Arg-Gly-Thr-Asn-Phe-Pro-Ser-Asp-Gly-Pro-Val-Met-Gln-Lys-Lys-Thr-Met-Gly-Trp-Glu-Ala-Ser-Ser-Glu-Arg-Met-Tyr-Pro-Glu-Asp-Gly-Ala-Leu-Lys-Gly-Glu-Ile-Lys-Gln-Arg-Leu-Lys-Leu-Lys-Asp-Gly-Gly-His-Tyr-Asp-Ala-Glu-Val-Lys-Thr-Thr-Tyr-Lys-Ala-Lys-Lys-Pro-Val-Gln-Leu-Pro-Gly-Ala-Tyr-Asn-Val-Asn-Ile-Lys-Leu-Asp-Ile-Thr-Ser-His-Asn-Glu-Asp-Tyr-Thr-Ile-Val-Glu-Gln-Tyr-Glu-Arg-Ala-Glu-Gly-Arg-His-Ser-Thr-Gly-Gly-Met-Asp-Glu-Leu-Tyr-Lys-²³⁷ Gly-Asp-Glu-Val-Asp-Glu-Asp-Glu-Gly- ²⁴⁶Ala-Arg-Thr-Val-Cys-Tyr-Ala-Ala-His-Val-Glu-Gly-Ile-Gly-Trp-Gln-Gly-Ala-Val-Cys-Asp-Gly-Ala-Val-Ala-Gly-Thr-Thr-Gly-Gln-Ser-Arg-Arg-Met-Glu-Ala-Ala-Val-Ile-Ala-Thr-Ser-Gly-Thr-Gly-Gly-Val-Cys-Ala-Asn-Ala-His-Leu-Ala-Asp-Ile-Gly-Trp-Gln-Gly-Trp-Ala-Cys-Ala-Ala-Asp-Gly-Lys-Ala-Val-Thr-Val-Gly-Thr-Thr-Gly-Gln-Ser-Arg-Arg-Met-Glu-Ala-Leu-Gly-Leu-Gln-Val-Gly-Asn-Gly-Ser-Val-Ala-Ala-Gln-Ala-His-Val-Ala-Asp-Tyr-Gly-Trp-Leu-Asn-Ala-Glu-Gly-Gly-Asn-Pro-Val-Tyr-Val-Gly-Thr-Thr-Gly-Gln-Ser-Arg-Arg-Met-Glu-Ala-Val-Arg-Ile-Trp-Val-Leu-Glu-³⁷⁹His-His-His- His-His-His
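As a consistency check on the annotations above, the base ranges given for SEQ ID NO: 26 can be converted to amino acid ranges and compared with those given for SEQ ID NO: 27. The following is a minimal sketch; the helper names are hypothetical, and the codon table is deliberately partial, covering only the standard-genetic-code codons that occur in the acidic linker and the hexahistidine tag.

```python
# Segment boundaries (1-based, inclusive) as annotated for SEQ ID NO: 26.
DNA_SEGMENTS = [
    ("mCherry1", 1, 708),
    ("acidic linker", 709, 735),
    ("SHA", 736, 1128),
    ("XhoI cloning site", 1129, 1134),
    ("hexahistidine tag", 1135, 1152),
    ("stop codon", 1153, 1155),
]

# Partial codon table (standard genetic code); only the codons needed below.
CODONS = {"GGT": "G", "GAC": "D", "GAA": "E", "GTC": "V", "CAC": "H"}

LINKER_DNA = "ggtgacgaagtcgacgaagacgaaggt"   # bases 709-735
HIS_TAG_DNA = "caccaccaccaccaccac"           # bases 1135-1152


def codon_ranges(segments):
    """Map 1-based base ranges to 1-based codon (amino acid) ranges."""
    result = {}
    for name, start, end in segments:
        if (start - 1) % 3 != 0 or (end - start + 1) % 3 != 0:
            raise ValueError(f"{name} is not in frame")
        result[name] = ((start - 1) // 3 + 1, end // 3)
    return result


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the partial table above."""
    dna = dna.upper()
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


ranges = codon_ranges(DNA_SEGMENTS)
for name, (first, last) in ranges.items():
    print(f"{name}: codons {first}-{last}")

print(translate(LINKER_DNA))   # GDEVDEDEG
print(translate(HIS_TAG_DNA))  # HHHHHH
```

With these boundaries, the linker occupies codons 237-245, SHA together with the two-codon cloning site occupies codons 246-378, and the hexahistidine tag occupies codons 379-384, which agrees with the amino acid ranges listed for SEQ ID NO: 27.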

The SHA homologues having strong binding affinity to L-rhamnose or D-galactose are encompassed in this disclosure. These SHA homologues can be identified by their carbohydrate-binding properties, e.g., by using the commercially available Glycan Array 100 slides or other similar assays. Optionally, these SHA homologues can be further modified by substituting, deleting, or adding one or more amino acid residues to SEQ ID NO: 25. The modified SHA homologues may be tested for binding affinity to L-rhamnose and/or D-galactose to select the SHA homologues having similar or even improved binding affinity. Both SHA homologues without modification and the modified SHA homologues can be used for developing the GFP-SHA fusion proteins described above.

In some embodiments, SHA, the homologues or fragments of SHA can be labeled by a fluorescein or a derivative thereof such as FITC for easy detection. The fluorescein labeling has minimal impact on the binding activities of SHA or the homologue or fragment thereof.

In some embodiments, labeled or unlabeled SHA, SHA homologues or SHA fragments, and fusion proteins comprising a fluorescent protein and SHA or a homologue or fragment thereof disclosed herein, specifically bind to D-galactose and glycans containing Gal-α-1-3. In some embodiments, labeled or unlabeled SHA, SHA homologues or SHA fragments, and fusion proteins comprising a fluorescent protein and SHA or a homologue or fragment thereof disclosed herein, specifically bind to β-Gal-; α-Rha-; Gal-α-1,3-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-Gal-β-1,4-Glc-β-; Gal-α-1,4-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β- (for example, Blood B antigen trisaccharide); Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β- (for example, Blood B antigen tetrasaccharide); or Gal-α-1,3-Gal-β-. In some embodiments, labeled or unlabeled SHA, SHA homologues or SHA fragments, and fusion proteins comprising a fluorescent protein and SHA or a homologue or fragment thereof disclosed herein, specifically bind to Gal-α-1,4-Gal-β-1,4-Glc-β-; GalNAc-β-1,3-Gal-β-1,4-Glc-β-; or Gal-α-1,4-Gal-β-1,4-GlcNAc-β-. In some embodiments, labeled or unlabeled SHA, SHA homologues or SHA fragments, and fusion proteins comprising a fluorescent protein and SHA or a homologue or fragment thereof disclosed herein, specifically bind to a carbohydrate terminating in or to a polysaccharide having one or more branches terminating in β-Gal-; α-Rha-; Gal-α-1,3-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-Gal-β-1,4-Glc-β-; Gal-α-1,4-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β-; Gal-α-1,3-Gal-β-; Gal-α-1,4-Gal-β-1,4-Glc-β-; GalNAc-β-1,3-Gal-β-1,4-Glc-β-; or Gal-α-1,4-Gal-β-1,4-GlcNAc-β-. In some embodiments, the fluorescent proteins include GFP and mCherry1. In some embodiments, the fluorescein or a derivative thereof includes FITC.
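For illustration, the terminal structures listed above can be used as a simple text lookup against a glycan written in the same notation. The following is a minimal sketch, assuming glycans are written with the non-reducing terminus first (so that "terminating in" corresponds to a string prefix) and that the notation matches the listed motifs character for character. The function name and the example glycan strings, including the trailing "SP" linker label, are hypothetical, and a text match of this kind does not predict binding.

```python
# Terminal structures as listed in the paragraph above.
TERMINAL_MOTIFS = [
    "β-Gal-",
    "α-Rha-",
    "Gal-α-1,3-Gal-β-1,3-GlcNAc-β-",
    "Gal-α-1,3-Gal-β-1,4-Glc-β-",
    "Gal-α-1,4-Gal-β-1,3-GlcNAc-β-",
    "Gal-α-1,3-(Fuc-α-1,2)-Gal-β-",
    "Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β-",
    "Gal-α-1,3-Gal-β-",
    "Gal-α-1,4-Gal-β-1,4-Glc-β-",
    "GalNAc-β-1,3-Gal-β-1,4-Glc-β-",
    "Gal-α-1,4-Gal-β-1,4-GlcNAc-β-",
]


def terminal_motif(glycan: str):
    """Return the longest listed motif that the glycan string starts with."""
    matches = [m for m in TERMINAL_MOTIFS if glycan.startswith(m)]
    return max(matches, key=len) if matches else None


# Hypothetical example strings, not taken from the array described herein.
examples = [
    "Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β-SP",
    "α-Rha-SP",
    "GlcNAc-β-1,4-GlcNAc-β-SP",
]
for glycan in examples:
    print(glycan, "->", terminal_motif(glycan))
```

The longest match is returned because some listed structures are prefixes of others; for instance, the Blood B antigen trisaccharide is a prefix of the tetrasaccharide.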

This disclosure demonstrates that archived SHA produced by S. sp. 27S5 and purified 40 years ago remained intact and maintained its carbohydrate-binding and hemagglutination (data not shown) activities, and that the molecular mass and primary structure of the archived SHA were successfully determined using modern mass spectrometric/proteomic strategies. The amino acid sequence of SHA was partially determined by Edman degradation methods in the 1970s, as described in the thesis of YFY (7). That study found redundancy in the N-terminal amino acid sequences of BrCN-cleaved SHA peptides, which was reasoned to be due to the presence of microheterogeneity in the purified protein. The primary structure disclosed herein clearly reveals that the difficulty of sequencing SHA in the 1970s was due to the three homologous SHA domains, which occupy 70% of the SHA molecule. It is fortuitous that the putative SHA gene of S. lavendulae was found in the Streptomyces genome database, which was expanded within two months after this protein was first revisited after 40 years. Consequently, the primary structure of SHA was revealed at last.

FTICR-MS revealed an average molecular mass of 13,314.67 Da and the presence of a covalently attached hexose in ~25% of the SHA molecules. The MS results suggest that hexose may be a component of SHA. Glycation of Lys in macromolecules, including hemoglobin, serum albumin, crystallin, and collagens, has been well studied (12-14). Given that the original SHA was obtained from a culture medium containing 2% D-fructose, it is possible that D-fructose was non-enzymatically attached to ε-amino groups of Lys. SHA was exposed to a high concentration of D-galactose after the original affinity purification, and significant amounts of D-galactose were found remaining in the archived SHA sample. Thus, it is also possible that D-galactose present in the SHA solution may have caused such a covalent linkage. Alternatively, it is possible that the hexose was added post-translationally by Streptomyces. However, the mass spectrometric data indicated multiple hexose-modified residues and not a single defined site (data not shown), hinting at the presence of an inhomogeneous chemical reaction rather than a well-defined in vivo posttranslational modification.
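The relationship between the reported masses and the z=8 ions shown in FIG. 3 can be illustrated with simple arithmetic. The following is a minimal sketch, assuming positive-mode ions carrying z protons, that the reported monoisotopic mass of 13,306.65 Da refers to the unmodified protein, and that each attached hexose adds the mass of C₆H₁₀O₅ (a hexose minus one water). These assumptions and the function name are illustrative and are not taken from the measured spectra.

```python
PROTON_MASS = 1.007276       # Da, mass of a proton
HEXOSE_ADDUCT = 162.0528     # Da, monoisotopic mass of C6H10O5 (assumed adduct)
SHA_MONOISOTOPIC = 13306.65  # Da, monoisotopic mass reported above


def mz(neutral_mass: float, z: int) -> float:
    """m/z of an ion formed by adding z protons to a neutral molecule."""
    return (neutral_mass + z * PROTON_MASS) / z


z = 8
unmodified = mz(SHA_MONOISOTOPIC, z)
with_hexose = mz(SHA_MONOISOTOPIC + HEXOSE_ADDUCT, z)

print(f"unmodified SHA,  z={z}: m/z {unmodified:.4f}")
print(f"SHA + 1 hexose,  z={z}: m/z {with_hexose:.4f}")
print(f"expected spacing at z={z}: {with_hexose - unmodified:.4f}")
```

Under these assumptions, a singly hexose-modified species would appear about 20.26 m/z units above the unmodified species at z=8, since the 162.05 Da mass difference is divided by the charge.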

The 131-amino acid primary structure of SHA was solved by showing that peptides derived from SHA aligned to the C-terminal two-thirds of the putative protein from S. lavendulae with >99% identity. Close comparison of peptides derived from SHA and the SHA domain of the putative protein revealed a single amino acid substitution at the SHA-equivalent position 108 in the putative protein, from E to A. Recombinant SHA(A108E) showed the same carbohydrate-binding specificity and similar affinity for L-rhamnose as archived SHA. These results confirmed that SHA is identical to the N-terminally truncated hypothetical protein in the genome of S. lavendulae, except that, in the putative protein, E in SHA-position 108 is substituted by A. The SHA(A108E) gene was used to express SHA proteins in different forms, including GFP-SHA. After the confirmation of L-rhamnose and D-galactose glycan specificity of the SHA(A108E) protein, this protein was designated as rSHA.

As the working examples demonstrate, SHA and eleven hypothetical protein homologues have three ChW-like SHA domains. To date, ChW domains have been exclusively found in the C. acetobutylicum species. The three ChW-like domains identified in SHA and its homologues represent additional examples for non-C. acetobutylicum proteins containing ChW domain repeats. The ChW domain is 45-47 amino acids long, and features an absolutely conserved tryptophan and high contents of hydrophobic and small amino acids. SHA homologues contain five conserved tryptophan residues, four of which are located in the three ChW-like SHA domains. Like the three SHA domains in SHA, the ChW domains cluster into groups of three, which suggests they function as a triplet (10). Although carbohydrate recognition functions have been suggested (9), no conclusive study has been published as to the role of ChW domains.
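The tryptophan content and the repeated character of the SHA sequence can be inspected directly from SEQ ID NO: 25. The following is a minimal sketch; the helper names are hypothetical, and the probe segment GTTGQSRRMEA was chosen by visual inspection of SEQ ID NO: 25 purely as an example of a recurring stretch. It is not asserted to correspond to SEQ ID NO: 24 or to the domain boundaries of SEQ ID NOs: 17, 19, and 21, which are not reproduced here.

```python
# SEQ ID NO: 25 (131 residues).
SEQ_ID_NO_25 = (
    "ARTVCYAAHVEGIGWQGAVCDGAVAGTTGQSRRMEAAVIATSGTGGVCANAHLADIGWQGWACAADGKAV"
    "TVGTTGQSRRMEALGLQVGNGSVAAQAHVADYGWLNAEGGNPVYVGTTGQSRRMEAVRIWV"
)


def residue_positions(seq: str, residue: str) -> list:
    """1-based positions at which a given residue occurs."""
    return [i + 1 for i, aa in enumerate(seq) if aa == residue]


def motif_starts(seq: str, motif: str) -> list:
    """1-based start positions of every occurrence of a motif."""
    starts = []
    i = seq.find(motif)
    while i != -1:
        starts.append(i + 1)
        i = seq.find(motif, i + 1)
    return starts


trp = residue_positions(SEQ_ID_NO_25, "W")
repeats = motif_starts(SEQ_ID_NO_25, "GTTGQSRRMEA")  # probe chosen by inspection

print("Trp positions:", trp)            # [15, 58, 61, 104, 130]
print("Repeat start positions:", repeats)  # [26, 73, 116]
```

The five tryptophan positions found this way agree with the five tryptophan residues mentioned above, and the probe segment recurs three times at a spacing of 43 to 47 residues, in keeping with a three-domain arrangement.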

The identified tryptophan residues may be involved in the binding function of SHA. It was previously reported that the circular dichroism (CD) spectrum of SHA strongly resembled that of poly(L-tryptophan), and it was speculated that tryptophan side chains contributed to a positive CD band at 226 nm (3). A potential involvement of tryptophan residues in L-rhamnose binding to SHA was also suggested (4). Those conclusions were based on solvent-perturbation studies, which demonstrated that the number of solvent-exposed Trp (or average extent of exposure) was two in the absence of L-rhamnose, and three in the presence of L-rhamnose. This suggested that one tryptophan residue becomes exposed as a result of SHA binding to this sugar. Oxidation of two tryptophan residues with N-bromosuccinimide led to complete loss of the carbohydrate-binding activity of SHA, which also indicated that these tryptophan residues are important for retaining this activity (4). Using NMR, the current study confirmed the involvement of tryptophans in the binding of SHA to L-rhamnose. By analogy, another L-rhamnose-specific protein, α-L-rhamnosidase of S. avermitilis, has three tryptophan residues binding to L-rhamnose via hydrophobic interaction with the pyranose ring of the sugar (15).

The above-mentioned structural information is helpful for understanding specificity and affinity of SHA. It is important to carry out extensive binding assays of SHA against a variety of glycans. In this study, the specific binding of both archived SHA and rSHA was compared side by side, using the Glycan Array 100, and at two concentrations of SHA in the absence and presence of L-rhamnose. Although semi-quantitative, the results clearly revealed the following: (1) SHA bound to D-galactose and glycans containing Gal-α-1-3, which is the key signature of blood type B specificity, as well as L-rhamnose; (2) SHA bound to L-rhamnose with the highest affinity among glycans tested, as evidenced by the fact that the binding to L-rhamnose was still observed when other positive binding signals were abolished in the presence of 0.2 M L-rhamnose; (3) SHA and rSHA showed the same glycan specificity profile, suggesting that rSHA represents the authentic SHA. These results are consistent with those previously published (2-4), confirming the blood type B and L-rhamnose specific nature of SHA. Gum arabic has been effectively used to purify SHA and SHA fusion proteins in the past and in this study. As previously reported (2), hemagglutination of type B-erythrocytes by SHA was inhibited in the presence of plant-originated galactomannans, with guar gum > locust bean gum > gum arabic. The glycan structure resembling guar gum, locust bean gum, and gum arabic remains to be determined. These galactomannans are used in foods as stabilizers, and it is interesting to note that in the clinical setting, fungal galactomannan is used as a biomarker for invasive pulmonary aspergillosis, a life-threatening infection mainly affecting immunocompromised patients (16).

Although microbial lectins with characteristics similar to SHA have not been reported, significant data on L-rhamnose binding lectins (RBLs) from fish eggs are available (17-21). Interestingly, RBLs from a number of different fish species are composed of two or three domains of approximately 100 amino acids each, which are known as carbohydrate-recognition domains (RBL CRDs) (22,23). A lectin purified from sea urchin (Anthocidaris crassispina) eggs (SUEL) was reported to contain a galactose-binding lectin domain (24), but was later shown to bind L-rhamnose preferentially, which seems reasonable given that L-rhamnose and D-galactose share the same hydroxyl group orientation at C2 and C4 of the pyranose ring structure (22,23). The RBL CRD, also called the SUEL-type lectin domain, is composed of eight highly conserved half-Cys residues and several other conserved segments, e.g., YGA in the N-terminal domain and DP and K in the C-terminal domain (22). However, the RBL CRD shows no homology to SHA domains, as indicated by its domain size, which is over three times longer than the SHA domain; the absence of tryptophan, which is the signature of SHA domains; and its heavily disulfide-linked domain structure.

The functions of L-rhamnose-specific lectins are of particular interest. One suggested physiological role of fish egg lectins is as a defense mechanism against pathogenic bacteria (17). Rhamnose-binding lectins from salmon and trout are involved in innate immunity and recognition of lipopolysaccharides (LPS) or lipoteichoic acid (LTA), respectively, on the cell surface of bacteria (20,25). In contrast to animal lectins, lectins produced by microorganisms have different functions. Bacterial surface agglutinins with mannose specificity play roles in cell-cell interactions, as well as in microbial pathogenicity (26). The related functions of SHA are expected to include interactions with outside cells, such as attaching to neighboring plants and surrounding microorganisms, in addition to potential defense mechanisms. The closest SHA homologue was found in the S. lavendulae genome, which encodes it as part of a putative polysaccharide deacetylase. If expressed by Streptomyces, this enzyme would be expected to catalyze the N- or O-deacetylation of acetylated sugars on the membranes of Gram-positive bacteria. However, it is not likely that SHA has such deacetylation activities, as SHA does not seem to recognize N-acetylated carbohydrates, as seen in the glycan array results.

The comparison of SHA to genomically-derived hypothetical proteins revealed the intriguing observation that the SHA-homologous domains of all eleven hypothetical proteins are localized in the C-terminal regions of the larger ORFs. Under the culturing conditions described (3), expression of the SHA-homologous proteins encoded by the genomes of S. lavendulae and S. sp. Mg1 was not observed (data not shown). In contrast, when the original study was performed in the 1970s, three HA activity-positive strains were identified from the 333 Actinomycetales culture supernatants screened (1). During the original screening, culture supernatants were serially diluted and incubated with blood type A, B, O, or AB erythrocyte suspensions. Supernatants that showed HA activity at 4- or 8-fold dilutions on titer plates were considered to be substantially positive; this included S. sp. 27S5 (1). SHA was purified from culture supernatants of S. sp. 27S5 by gum arabic affinity chromatography (3). It is possible that SHA could have been expressed as a precursor protein with an unknown N-terminal sequence, a signal sequence, and a protease-processing site, so that SHA molecules could be found in the culture broth, as observed 40 years ago.

As disclosed herein, the recombinant GFP-SHA binds to L. casei Shirota cells. Additional bacteria and fungi can be screened to identify microorganisms that interact with SHA. A similar approach was reported for a recombinant horseshoe crab plasma lectin that recognizes specific pathogen-associated molecular patterns of bacteria through L-rhamnose (27).

The SHA protein and homologues thereof, as well as fusion proteins of GFP or mCherry1 and an SHA protein, an SHA homologue, or a functional fragment thereof (GFP-SHA and mCherry1-SHA fusion proteins), have a variety of novel uses in detecting the presence of certain microorganisms or cancer or tumor cells, diagnosing certain microbial infections or certain types of cancer or tumor, and detecting or imaging the location of microbial infections, or cancer or tumor, e.g., by PET scanning. Ideally, the fusion proteins are soluble, non-aggregating, and stable for an extended period of time. The fusion proteins also retain the binding activity of SHA, a homologue, or a fragment thereof. The fusion proteins can bind well to the gum arabic carbohydrate column material and can be eluted with L-rhamnose or D-galactose.

For example, various bacteria or fungi expressing the dTDP-4-dehydrorhamnose reductase gene (rmlD) may be detected in vitro by contacting the bacteria or fungi with a GFP-SHA or mCherry1-SHA fusion protein disclosed herein and monitoring the presence or change of fluorescence. In some embodiments, the bacterial or fungal cell expresses a carbohydrate containing L-rhamnose or D-galactose and displays the carbohydrate on the surface of the cell. The GFP-SHA and mCherry1-SHA fusion proteins can detect the presence of such microorganisms in various samples from a subject, e.g., a biopsy sample, a tissue sample, a bronchoalveolar lavage sample, a blood sample, and a urine sample. Alternatively, if the bacterial or fungal cell expresses a carbohydrate containing L-rhamnose or D-galactose but does not display the carbohydrate on the surface of the cell, the GFP-SHA and mCherry1-SHA fusion proteins can detect the presence of such microorganisms in fixed tissue samples, e.g., paraffin-fixed or formalin-fixed and paraffin-embedded tissue samples.

Similarly, various cancer or tumor cells expressing tumor-specific carbohydrates can be detected using the GFP-SHA and mCherry1-SHA fusion proteins disclosed herein. Such carbohydrates contain or terminate in beta-Galactose- (β-Gal-); Gal-α-1,3-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-Gal-β-1,4-Glc-β-; Gal-α-1,4-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β-; Gal-α-1,3-Gal-β-; Gal-α-1,4-Gal-β-1,4-Glc-β-; GalNAc-β-1,3-Gal-β-1,4-Glc-β-; or Gal-α-1,4-Gal-β-1,4-GlcNAc-β-. In some embodiments, the core structures of the Thomsen-Friedenreich and mucin antigens terminate in galactose and therefore can be detected by the GFP-SHA and mCherry1-SHA fusion proteins disclosed herein. The GFP-SHA and mCherry1-SHA fusion proteins can detect the presence of such cancer or tumor cells in various samples from a subject, e.g., a biopsy sample, a tissue sample, a bronchoalveolar lavage sample, a blood sample, and a urine sample. Alternatively, if the cancer or tumor cell expresses a carbohydrate capable of specifically binding to SHA, a homologue thereof, or a fragment of the SHA or a homologue thereof disclosed herein but does not display the carbohydrate on the surface of the cell, the GFP-SHA and mCherry1-SHA fusion proteins can detect the presence of such a cancer or tumor cell in fixed tissue samples, e.g., paraffin-fixed or formalin-fixed and paraffin-embedded tissue samples. In some embodiments, the cancer is colon cancer.

The GFP-SHA and mCherry1-SHA fusion proteins can be used for diagnosis of various microbial infections caused by one or more microorganisms expressing a carbohydrate containing L-rhamnose or D-galactose, or of various cancers or tumors expressing tumor-specific carbohydrates capable of specifically binding to SHA, a homologue thereof, or a fragment of the SHA or a homologue thereof disclosed herein. The method entails the step of contacting a sample obtained from a subject suffering from a microbial infection or a cancer or tumor with a GFP-SHA fusion protein or an mCherry1-SHA fusion protein, and determining the fluorescence level in the sample, wherein the presence of the fluorescence indicates the condition of the microbial infection or cancer or tumor. In some embodiments, the method further entails the step of contacting a sample obtained from a negative control subject, such as a healthy subject or the subject before the microbial infection or without cancer or tumor, with a GFP-SHA or mCherry1-SHA fusion protein, and comparing the fluorescence levels of the sample of the negative control subject with the sample of the subject suffering from the microbial infection or cancer or tumor, wherein the difference in the fluorescence levels indicates the microbial infection or presence of cancer or tumor. Alternatively, the fluorescence level of a negative control subject can be established by an average or median fluorescence level of a population of healthy subjects who do not suffer from the microbial infection or cancer or tumor. In some embodiments, the cancer is colon cancer.
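The comparison against a negative control described above can be sketched as follows. This is a minimal illustration only: the function names, the fold-change cutoff, and the fluorescence readings are hypothetical and are not specified in this disclosure, which states only that a difference in fluorescence levels relative to a control (or to the average or median of a healthy population) is indicative.

```python
from statistics import mean, median

def control_baseline(control_levels, use_median=False):
    """Baseline fluorescence from a population of healthy control samples."""
    return median(control_levels) if use_median else mean(control_levels)

def is_positive(sample_level, baseline, fold_cutoff=2.0):
    """Flag a sample whose fluorescence exceeds the baseline by an assumed fold cutoff.

    The cutoff of 2.0 is an illustrative assumption, not a value from the disclosure.
    """
    return sample_level >= fold_cutoff * baseline

# Hypothetical fluorescence readings (arbitrary units), for illustration only.
healthy = [100.0, 120.0, 110.0, 90.0, 105.0]
baseline = control_baseline(healthy)  # mean of the control population

print(baseline, is_positive(400.0, baseline), is_positive(115.0, baseline))
```

In practice the cutoff would have to be established empirically for each sample type and instrument.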

In a related aspect, this disclosure relates to a method of determining the prognosis of treating a microbial infection caused by one or more microorganisms expressing a carbohydrate containing L-rhamnose or D-galactose, or a cancer or tumor expressing a carbohydrate tumor antigen capable of specifically binding to SHA, a homologue thereof, or a fragment of the SHA or a homologue thereof disclosed herein. The method entails the step of contacting a sample obtained from a subject suffering from a microbial infection or a cancer or tumor with a GFP-SHA or mCherry1-SHA fusion protein to determine the fluorescence level, treating the subject suffering from a microbial infection with one or more antimicrobial agents or the subject suffering from a cancer or tumor with one or more cancer therapies, contacting a sample obtained from the subject after the treatment with a GFP-SHA or mCherry1-SHA fusion protein to determine the fluorescence level, and comparing the fluorescence levels before and after the treatment to determine the prognosis of the treatment. The method can further comprise administering to the subject an alternative antimicrobial agent or cancer therapy or an additional amount of the antimicrobial agent or cancer therapy if a desired prognosis is not achieved. In some embodiments, the cancer is colon cancer.

Infections caused by various microorganisms or certain types of cancers or tumors can be detected based on the specific binding of the GFP-SHA or mCherry1-SHA fusion proteins disclosed herein with the carbohydrate containing L-rhamnose or D-galactose displayed on the surface of the microorganisms or the carbohydrate tumor antigen capable of specifically binding to SHA, a homologue thereof, or a fragment of the SHA or a homologue thereof disclosed herein. As demonstrated in the working examples, the GFP-SHA or mCherry1-SHA fusion proteins and fluorescein-labeled SHA disclosed herein can specifically bind to β-Gal-; α-Rha-; Gal-α-1,3-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-Gal-β-1,4-Glc-β-; Gal-α-1,4-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β-; Gal-α-1,3-Gal-β-; Gal-α-1,4-Gal-β-1,4-Glc-β-; GalNAc-β-1,3-Gal-β-1,4-Glc-β-; or Gal-α-1,4-Gal-β-1,4-GlcNAc-β-. The GFP-SHA fusion proteins can be used to detect microbial infections caused by a microorganism expressing a carbohydrate terminating in or otherwise exposing the aforementioned monosaccharides or oligosaccharides. Alternatively, the GFP-SHA and mCherry1-SHA fusion proteins can be used to detect cancer or tumor cells expressing a carbohydrate tumor antigen terminating in or otherwise exposing the aforementioned monosaccharides or oligosaccharides. In some embodiments, the cancer is colon cancer.

In yet another related aspect, disclosed herein is a method of imaging a local microbial infection site caused by a microorganism expressing a carbohydrate containing L-rhamnose or D-galactose. Alternatively, the method of imaging a tumor site, where the tumor cell expresses a tumor-specific antigen including a carbohydrate capable of specifically binding to SHA, a homologue thereof, or a fragment of the SHA or a homologue thereof disclosed herein, can be performed in a similar way. In some embodiments, the cancer is colon cancer. A GFP-SHA or mCherry1-SHA fusion protein, labeled or unlabeled SHA protein or a homologue thereof, or a functional fragment of the SHA protein or a homologue thereof, can be labeled with a PET isotope to produce a PET probe. The method entails the step of administering the PET probe to a subject suffering from a microbial infection or a cancer or tumor, and performing a PET scan of the subject to image the location of the microbial infection or cancer or tumor, or to detect the location of the PET probe, thereby determining the location of the microbial infection or the cancer or tumor. In some embodiments, the PET probe is administered to the subject by intravenous injection. In some embodiments, the PET probe is locally administered to the microbial infection site or the tumor site.

The selection and use of a suitable PET probe can be done based on the knowledge in the field (29). For example, the PET imaging can be performed using a DOTA-labeled GFP-SHA or mCherry1-SHA fusion protein. DOTA is a chelator (1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetic acid) that is used to attach PET imaging metal isotopes to proteins. For example, ⁶⁸Ga or ⁶⁴Cu can be used in this technique. There are many other PET metal isotopes that can be used and are compatible with DOTA chelation. Other labeling techniques also can be used, for example, with non-metal PET isotopes such as ¹²⁴I and ¹⁸F.

In some embodiments, the DOTA labeling is conducted via an attachment to amino acids such as lysine (via amino groups) or cysteine (via thiols). In some embodiments, the DOTA label is attached to one or more amino acids located in the SHA protein, a homologue thereof, or a functional fragment of the SHA protein or a homologue thereof. In some embodiments, the DOTA label is attached to one or more amino acids located in GFP or mCherry1 of a GFP-SHA or mCherry1-SHA fusion protein.

The following examples are provided to better illustrate the claimed invention and are not to be interpreted as limiting the scope of the invention. To the extent that specific materials are mentioned, it is merely for purposes of illustration and is not intended to limit the invention. One skilled in the art may develop equivalent means or reactants without the exercise of inventive capacity and without departing from the scope of the invention.

EXAMPLES

Materials and Methods

Materials—

S. lavendulae strain NCIB 6959/ATCC 14158 and HEK293S cells were purchased from ATCC (Manassas, Va.). S. sp. strain Mg1 was a kind gift from Dr. Paul Straight of Texas A&M University (6). E. coli C41(DE3) and E. cloni® were from Lucigen (Middleton, Wis.). Gum arabic was purchased from Sigma-Aldrich (St. Louis, Mo.). MS-grade trypsin, LysC, ArgC, V8 protease, and pepsin were from Promega (Madison, Wis.). Chymotrypsin was from Worthington Biochemical (Lakewood, N.J.). pET32b and pcDNA3.1 vectors were from Merck Millipore (Billerica, Mass.) and Thermo Fisher Scientific (Waltham, Mass.), respectively.

Purification and Characterization—

SHA was purified forty years ago as described (3) and kept frozen at −80° C. The purity and quality of the archived SHA were determined using SDS-PAGE. The N-terminal amino acid sequence of SHA was determined using Edman degradation performed on the Procise 494HT Protein Sequencing System (Applied Biosystems, Thermo Fisher Scientific).

Specific Binding of SHA to Gum Arabic Gels—

Gum arabic gels were prepared according to published methods (3). The archived SHA as well as recombinant SHA proteins were applied to the gum arabic gel column. After washing the column, SHAs were eluted with either 1 M D-galactose in the presence of 1 M NaCl as described (3), or 0.2 M L-rhamnose in the presence of 1 M NaCl.

NMR Titration Study—

NMR analysis was performed using a DRX-500 spectrometer equipped with a cryogenic TXI probe (Bruker BioSpin, Billerica, Mass.). The probe temperature was set to 298 K. Archived SHA (0.1 mg) was dissolved in 500 μl of 20 mM sodium phosphate buffer, pH 6.5 (H₂O:D₂O=9:1). L-rhamnose solution was added to the SHA solution at molar ratios from 1:0 to 1:5 (SHA:L-rhamnose). Data processing and analysis were performed using XWIN-NMR (ver. 3.5, Bruker BioSpin). NMR spectra were displayed with XWIN-PLOT (ver. 3.5, Bruker BioSpin).
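As a worked illustration of the titration setup, the sketch below estimates the SHA concentration in the NMR sample and the L-rhamnose concentration at each molar ratio. It assumes the average molecular mass of 13,314.67 Da reported in Example 2, ignores dilution from the added L-rhamnose solution, and assumes integer ratio steps between 1:0 and 1:5 (the step size is not stated in the text). The function name is hypothetical.

```python
SHA_AVG_MASS_DA = 13314.67  # average molecular mass reported in Example 2

def molar_conc_uM(mass_mg, volume_uL, mw_da):
    """Concentration in micromolar from mass (mg), volume (µL), and molecular weight (g/mol)."""
    moles = (mass_mg * 1e-3) / mw_da
    litres = volume_uL * 1e-6
    return moles / litres * 1e6

# 0.1 mg SHA dissolved in 500 µL buffer
sha_uM = molar_conc_uM(0.1, 500, SHA_AVG_MASS_DA)
print(f"SHA: {sha_uM:.1f} µM")

# Assumed integer steps from 1:0 to 1:5 (SHA:L-rhamnose), dilution ignored
for ratio in range(0, 6):
    print(f"SHA:L-rhamnose = 1:{ratio} -> {ratio * sha_uM:.1f} µM L-rhamnose")
```

Under these assumptions the SHA concentration is about 15 µM, so the 1:5 point corresponds to roughly 75 µM L-rhamnose.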

Mass Spectrometry—

To determine molecular mass, the intact archived SHA was analyzed using electrospray ionization (ESI) Fourier Transform Ion Cyclotron Resonance (FTICR)-MS on a Thermo LTQ FTICR (Thermo Fisher) at ~500,000 resolution.

To determine the amino acid sequence of SHA, overlapping SHA peptides were obtained by performing separate enzymatic digestions with trypsin, chymotrypsin, LysC, ArgC, V8 protease, and pepsin, and analyzed by LC-MS on an Orbitrap Fusion Tribrid Mass Spectrometer (Thermo Fisher Scientific, Waltham, Mass.), as well as by MALDI-MS on a SimulTof Combo 200 instrument (SimulTOF Systems, Virgin Instruments, Marlborough, Mass.). MS and MS/MS collision-induced dissociation (CID) fragmentation data from these peptides were analyzed with Xcalibur software (Thermo Fisher Scientific) and with PEAKS Studio software (Bioinformatics Solutions Inc., Waterloo, Ontario, Canada).

SHA disulfide bond determination was made using MALDI-MS and high (120,000) resolution Thermo Orbitrap Fusion Tribrid Mass Spectrometer analysis of the intact protein and the digested protein, before and after reduction with 50 μM TCEP, pH 2.0, at 80° C. for 30 min.

Expression of an SHA Homologous Recombinant Protein—

The SHA-homologous domain of the putative protein from S. lavendulae, which showed the highest homology to SHA (>99% identity), was expressed as a recombinant protein. To develop this recombinant SHA homologue, a synthetic gene encoding the wild-type SHA domain of the putative protein and a mutant SHA gene with an A to E amino acid substitution at position 108 (A108E) were produced using E. coli codon-optimized overlapping oligo DNA primers and cloned into pET32b (Table 1). The primer binding sites are illustrated in FIG. 1.

TABLE 1

Primer Name    SEQ ID NO:    DNA Sequence
P1f            1             TGCGCGAACGCGCACCTGGCGGACATCGGTTGGCAGGGTTGGGCGTGCGCGGCGGACGGT
P2f            2             CGTATGGAAGCGGCGGTTATCGCGACCTCTGGTACCGGTGGTGTTTGCGCGAACGCGCAC
P3f            3             GTTTGCGACGGTGCGGTTGCGGGTACCACCGGTCAGTCTCGTCGTATGGAAGCGGCGGTT
P4f            4             GCGGCGCACGTTGAAGGTATCGGTTGGCAGGGTGCGGTTTGCGACGGTGCGGTTGCGGGT
5af            5             AAAGAATTCGCGCCGGCGGCGCGTACCGTTTGCTACGCGGCGCACGTTGAAGGTATCGGT
P1r            6             TCCATACGACGAGACTGACCGGTGGTACCAACGGTAACCGCTTTACCGTCCGCCGCGCAC
P2r            7             GCCTGCGCCGCAACAGAACCGTTACCAACTTGCAGACCCAGCGCTTCCATACGACGAGAC
P3r            8             TTACCACCCGCCGCGTTCAGCCAACCGTAGTCCGCAACGTGCGCCTGCGCCGCAACAGAA
P4ar           9             CGACGGGACTGACCAGTAGTGCCAACGTAAACCGGGTTGCCACCCGCCGCGTTCAGCCAA
P5ar           10            TTTCTCGAGTTAAACCCAGATACGAACCGCTTCCATACGACGGGACTGACCAGTAGTGCC
PDSL A108Ef    11            GACTACGGTTGGCTGAACGCGGAAGGTGGCAACCCGGTTTACGTTGGC
PDSL A108Er    12            GCCAACGTAAACCGGGTTGCCACCTTCCGCGTTCAGCCAACCGTAGTC
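Because the synthetic gene was assembled from overlapping oligonucleotides, the 3' end of a forward primer should match the reverse complement of the 3' end of its partner reverse primer. The sketch below checks this for the P1f/P1r pair from Table 1. The helper names are hypothetical, and the pairing of P1f with P1r is an assumption based on the primer names; the actual assembly scheme is the one shown in FIG. 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_overlap(forward, reverse):
    """Length of the longest 3' suffix of `forward` that equals a 5' prefix
    of the reverse complement of `reverse` (0 if there is none)."""
    rc = reverse_complement(reverse)
    for k in range(min(len(forward), len(rc)), 0, -1):
        if forward.endswith(rc[:k]):
            return k
    return 0

# Sequences copied from Table 1 (SEQ ID NO: 1 and SEQ ID NO: 6)
P1F = "TGCGCGAACGCGCACCTGGCGGACATCGGTTGGCAGGGTTGGGCGTGCGCGGCGGACGGT"
P1R = "TCCATACGACGAGACTGACCGGTGGTACCAACGGTAACCGCTTTACCGTCCGCCGCGCAC"

k = longest_overlap(P1F, P1R)
print(k, P1F[-k:] if k else "")
```

For this pair the check finds a 16-nucleotide complementary region at the 3' ends, which is consistent with an overlap-extension design; other primer pairs can be tested the same way.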

The recombinant wild-type SHA was expressed in E. coli C41(DE3) as a thioredoxin (Trx) fusion protein with a His-tag. Trx-SHA was purified from E. coli cell pellets derived from a 2-L culture by solubilization and affinity purification on a Ni-NTA resin (Thermo Fisher Scientific). The purified wild-type SHA was digested with multiple enzyme combinations, as described above for SHA, to compare the resulting peptides from both proteins.

Due to solubility issues, various fusion proteins of the recombinant SHA were prepared and expressed in E. coli. Of those, a yeast SUMO (SMT3)-fusion protein was successfully purified for comparing carbohydrate-binding specificity with that of archived SHA. Briefly, SMT3-fused SHA(A108E) was prepared by insertion at the SMT3 and Ulp1 cleavage sites of pET32b/SHA(A108E). E. cloni® (Lucigen) was transformed with pET32b/SMT3-SHA(A108E). SMT3-SHA(A108E) was purified from transformed cells using a His6-tag specific nickel-NTA column after solubilization with 5 M urea/B-Per lysis buffer (Pierce), followed by refolding in the presence of 1 M galactose and 10 mM β-mercaptoethanol. SHA(A108E) was cleaved off from SMT3 bound to the column by incubating with Ulp1. The resulting SHA(A108E) was purified on gum arabic gels. The authenticity of SHA(A108E) was confirmed by SDS-PAGE and glycan microarray analyses.

Glycan Microarray Analyses—

Microarray analysis was performed according to the manufacturer's recommendations using RayBio Glycan Array 100 (RayBiotech, Norcross, Ga.) slides. Each slide contains four sub-microarrays printed with 100 synthetic glycans. Briefly, 200 μL of 0.1 mg/mL of both archived SHA and rSHA were dialyzed overnight at 4° C. against 1×PBS dialysis buffer to avoid contaminating samples with amines prior to biotinylation. Dialyzed samples were incubated with biotin-containing reaction solution at 22° C. for 30 min. Sub-arrays were blocked for 30 min at 22° C. After biotinylated SHA samples were diluted with 1×PBS, 400 μL of each sample was added to each sub-array. Slide #1 sub-arrays were incubated with 400 μL of 20 μg/mL (1×) or 2 μg/mL (0.1×) SHA, in the absence or presence of 0.2 M L-rhamnose. Slide #2 sub-arrays were incubated with 400 μL of 20 μg/mL (1×) or 2 μg/mL (0.1×) rSHA, in the absence or presence of 0.2 M L-rhamnose. Slides were incubated for 16 h at 4° C. for highest intensities. Washing was performed according to the manufacturer's protocol, followed by incubation with Cy3 dye-conjugated streptavidin. The slides were incubated at 22° C. for 1 h with gentle shaking, then washed multiple times as recommended. The signals were visualized using an Agilent DNA microarray scanner (Model G2505C; Agilent, Santa Clara, Calif.) at 532 nm for Cy3. Data extraction and analysis were performed after subtraction of the background and normalization to the internal references provided by the manufacturer, using the ImageJ Protein Array Analyzer software (28).
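The dilution and normalization steps above can be sketched as follows. This is an illustrative calculation only: it assumes that the protein concentration is unchanged by dialysis and biotinylation, and the spot intensities are hypothetical values, not data from this study. The function names are also hypothetical and do not correspond to the Protein Array Analyzer interface.

```python
def stock_volume_uL(stock_ug_per_mL, target_ug_per_mL, final_volume_uL):
    """C1*V1 = C2*V2: volume of stock needed to prepare the diluted sample."""
    return target_ug_per_mL * final_volume_uL / stock_ug_per_mL

def normalize(raw, background, reference):
    """Background-subtract a spot and scale it to a background-subtracted reference spot."""
    return (raw - background) / (reference - background)

# 0.1 mg/mL stock = 100 µg/mL; 400 µL applied per sub-array
for target in (20, 2):  # 1x and 0.1x concentrations in µg/mL
    print(f"{target} µg/mL: {stock_volume_uL(100, target, 400)} µL stock made up to 400 µL with 1x PBS")

# Hypothetical intensities (arbitrary units), for illustration only
print(normalize(raw=5200.0, background=200.0, reference=10200.0))
```

Under these assumptions, the 1× and 0.1× samples require 80 µL and 8 µL of stock per sub-array, respectively.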

Staining of Lactobacillus casei (Shirota) Cells by Fluorescently Labeled SHA—

Recombinant GFP-SHA was expressed by inserting the SHA(A108E) gene at the C-terminus of GFP in pET28/GFP, followed by transformation of E. cloni cells. GFP-SHA was purified from cell pellets collected from a 4-L culture, after solubilization with 5 M urea/B-Per lysis buffer (Pierce), using a His6-tag specific nickel-NTA column, followed by refolding in the presence of 1 M galactose and 10 mM β-mercaptoethanol, and eluting with 400 mM imidazole. GFP-SHA was concentrated using Centricon YM10 centrifugal filters (Fisher Scientific) and purified by FPLC with Superdex 75G (GE Healthcare Life Science, Pittsburgh, Pa.).

L. casei Shirota cells were isolated from a commercially available Yakult yogurt drink. The authenticity of L. casei Shirota was verified by Sanger sequencing of its 16S rRNA, which showed a 100% match to the reference sequence AB531131. Four hundred mL of Difco™ Lactobacilli MRS Broth (Fisher Scientific) was inoculated with L. casei Shirota cells at a concentration of 10⁶ cells/mL. L. casei Shirota cells were grown for 16 h at 37° C., harvested by centrifugation, and washed three times with 1×PBS. Cells were re-suspended in 5 mL of 70% ethanol and incubated at 22° C. for 30 min under continuous rotation. Cells were washed three times with 1×PBS, then re-suspended in 5 mL of 1×PBS. Bacterial cells were blocked for non-specific binding with 3% BSA in PBS and NP-40 (0.5%) for 30 min, followed by a 1 h incubation with 50 μM of GFP-SHA or GFP as a negative control. Cells were washed three times with 1×PBS and finally re-suspended in 1 mL of PBS containing 10% glycerol. Bacterial cells were counter-stained with DAPI (3 μM) and examined using a Zeiss Observer II system (Carl Zeiss, Jena, Germany). Fluorescent images were analyzed using Image-Pro Plus and ZEISS ZEN software (Carl Zeiss, Jena, Germany).

Example 1: Preservation of Active SHA

It was first confirmed that the SHA protein that was purified 40 years ago and archived in a frozen state was intact, readily bound to a gum arabic affinity chromatography column, and was specifically eluted with a competing monosaccharide, D-galactose or L-rhamnose, as shown in FIG. 2, lanes 1-3.

Example 2: Determination of the Molecular Mass of SHA Using Mass Spectrometry

The molecular mass of SHA was previously estimated to be approximately 11 kDa, based on various approaches, including gel filtration in the presence of 6 M guanidine hydrochloride, SDS-PAGE, and sedimentation equilibrium analysis (3). Electrospray ionization (ESI) Fourier Transform Ion Cyclotron Resonance (FTICR) MS was applied to determine the molecular mass of SHA more precisely. This high-resolution mass spectrometric technique revealed a precise average molecular mass of 13,314.67 Da, a monoisotopic mass of 13,306.65 Da, and the presence of a covalently attached hexose in ~25% of the SHA molecules (FIG. 3).
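As a rough check on the hexose-modified species, the sketch below adds the mass of one hexose residue to the reported SHA masses. It assumes the hexose is attached with loss of one water molecule, i.e., a net addition of C6H10O5 (162.1406 Da average, 162.0528 Da monoisotopic); the attachment chemistry is not specified in the text, so this is an assumption.

```python
# Net mass added per hexose residue (C6H10O5), assuming attachment with loss of water
HEXOSE_AVG = 162.1406   # average mass, Da
HEXOSE_MONO = 162.0528  # monoisotopic mass, Da

SHA_AVG = 13314.67      # reported average mass, Da
SHA_MONO = 13306.65     # reported monoisotopic mass, Da

def with_hexose(mass, n=1, hexose=HEXOSE_AVG):
    """Expected mass after adding n hexose residues."""
    return mass + n * hexose

print(round(with_hexose(SHA_AVG), 2))                       # average mass of SHA + 1 hexose
print(round(with_hexose(SHA_MONO, hexose=HEXOSE_MONO), 2))  # monoisotopic mass of SHA + 1 hexose
```

Under this assumption the hexose-bearing form would be expected near 13,476.8 Da (average), about 162 Da above the unmodified protein.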

Example 3: Identification of SHA Homologues in Streptomyces Genomes

To determine the sequence identity of SHA, bottom-up proteomics experiments were conducted. SHA was digested separately with several proteases to generate overlapping peptides. These peptides were then analyzed by liquid chromatography (LC) coupled with high-resolution multistage mass spectrometry (MS/MS). An initial database search was performed and revealed a closely matching SHA homologue in the genome of Streptomyces sp. Mg1 as a hypothetical protein (GenBank accession #EDX26679.1) (6); data not shown. More Streptomyces genome sequences became available later; subsequently, a more refined search led to the identification of a homologue in S. lavendulae with even better scores for MS/MS database matching. The digested SHA peptides aligned almost completely with the deduced C-terminal amino acid sequence of the putative polysaccharide deacetylase of S. lavendulae (Accession number WP_051840348.1; FIG. 4). SHA matched the C-terminal 131 amino acids of the hypothetical 199-amino acid protein, except for a partial sequence stretch consisting of nine amino acids at SHA positions 101-109. However, the mass spectrometric data did not cover any sequence of the N-terminal portion of either the S. sp. Mg1 or S. lavendulae protein, comprising 74 and 68 amino acid residues, respectively.

Example 4: Determination of the N-Terminal Amino Acid Sequence of SHA

Previous amino acid sequencing of reduced and carboxymethylated SHA revealed the N-terminal amino acids to be AxTVCYAAxV (SEQ ID NO: 13) (7); x indicates an undetermined residue. To confirm these results and identify additional amino acids in the sequence, N-terminal sequencing of the archived SHA was performed. Approximately 30 amino acids were identified to be ARTVcYAAHVEGIGWQGAVcDGAVAxTtxQsRr (SEQ ID NO: 14) (lowercase letters indicate tentative identification). Together, the two independent N-terminal sequencing results strongly suggested that the N-terminal sequence of SHA was ARTVCY (SEQ ID NO: 15).

Example 5: Solution of the Primary Structure of SHA

By considering the N-terminal sequencing information, the molecular mass of the intact SHA protein, and the database matching with digested peptides, the SHA sequence appeared to be almost identical to the C-terminal portion of the putative protein, residues 69-199. To identify how the amino acid sequences differ between SHA and the SHA domain of the putative protein, a recombinant thioredoxin (Trx)-SHA fusion protein was generated, and peptide fingerprints of the recombinant SHA and SHA putative protein were compared. First, the homologous SHA domain from the putative protein was cloned into a pET32 vector to transform E. coli, from which the fusion protein was purified using Ni-NTA resin (FIG. 2, lane 4). Then the purified recombinant Trx-SHA fusion protein was digested using multiple enzymes to generate overlapping peptides for LC-MS and MALDI-MS analyses, as for SHA above. Finally, the LC-MS/MS data sets of the digested peptides from the recombinant SHA and its homologue SHA putative protein were compared. It was found that the sequence of the recombinant SHA differed from that of SHA putative protein by a single A108E change (FIGS. 5A and 5B). This was confirmed by calculating and comparing the molecular masses of the recombinant SHA and SHA putative protein, which showed a 58-Da mass difference.
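The 58-Da difference is what a single alanine-to-glutamate substitution would be expected to produce. A glutamate residue (C5H7NO3) differs from an alanine residue (C3H5NO) by C2H2O2, and the sketch below computes the mass of that difference from standard atomic masses. The variable and function names are illustrative only.

```python
# Standard atomic masses (Da)
MONO = {"C": 12.0, "H": 1.007825, "O": 15.994915}  # monoisotopic
AVG = {"C": 12.011, "H": 1.008, "O": 15.999}       # average

# Glu residue (C5H7NO3) minus Ala residue (C3H5NO) = C2H2O2
DELTA_A_TO_E = {"C": 2, "H": 2, "O": 2}

def mass(formula, table):
    """Sum atomic masses for an elemental composition given as {element: count}."""
    return sum(table[element] * count for element, count in formula.items())

print(round(mass(DELTA_A_TO_E, MONO), 3))  # monoisotopic shift for A -> E
print(round(mass(DELTA_A_TO_E, AVG), 3))   # average shift for A -> E
```

Both values are close to 58.0 Da, in agreement with the reported difference.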

To determine disulfide bonds in SHA, endoproteinase ArgC was used to digest archived SHA, with or without reduction with tris(2-carboxyethyl)phosphine (TCEP), followed by high resolution Orbitrap LC-MS. Comparison of the spectra of two digested peptides before and after TCEP reduction showed a clear 2-Da mass difference (FIG. 5C). This indicates that SHA contains two consecutive disulfide bonds that connect cysteine residues C05 with C20 and C48 with C63, as illustrated in FIG. 5D. No other disulfide bond-connected peptides were detected. Taken together, these results allowed deduction of the primary structure of SHA (illustrated in FIG. 5D, and summarized in Tables 2 and 3, along with repetitive domain structures).
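The 2-Da shift follows from the chemistry of reduction: each disulfide bond gains two hydrogen atoms (about 2.016 Da) when converted to two free thiols. The sketch below estimates the number of disulfide bonds in a peptide from its mass before and after reduction. The function name and the peptide masses are hypothetical and serve only to illustrate the arithmetic.

```python
H_MONO = 1.007825  # monoisotopic mass of hydrogen, Da

def disulfides_from_shift(mass_oxidized, mass_reduced):
    """Estimate the number of disulfide bonds from the mass gained on reduction.

    Each reduced disulfide adds two hydrogens (about 2.016 Da).
    """
    return round((mass_reduced - mass_oxidized) / (2 * H_MONO))

# Hypothetical peptide masses (Da), for illustration only
print(disulfides_from_shift(2500.000, 2502.016))  # one disulfide in this peptide
print(disulfides_from_shift(2500.000, 2500.000))  # no disulfide
```

Applied to each of the two ArgC peptides, a 2-Da gain per peptide corresponds to one disulfide bond in each, consistent with the two bonds assigned above.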

Table 2 demonstrates that the primary sequence of SHA is principally composed of three homologous SHA domains 1, 2, and 3, consisting of 29, 33, and 30 amino acids, respectively. Together, the three SHA domains comprise 92 amino acids, 70% of the total 131 amino acids in SHA. The 11-amino acid sequence GTTGQSRRMEA at the C-terminal end of each domain is completely matched among the three domains (underlined in the original table). Homology among the three SHA domains is shown in FIG. 8C.

TABLE 2

Location    Amino acid sequence                                   Number of residues    ChW-like domains
1-7         ARTVCYA (SEQ ID NO: 16)                               7
8-36        AHVEGIGWQGAVCDGAVAGTTGQSRRMEA (SEQ ID NO: 17)         29                    SHA domain 1
37-50       AVIATSGTGGVCAN (SEQ ID NO: 18)                        14
51-83       AHLADIGWQGWACAADGKAVTVGTTGQSRRMEA (SEQ ID NO: 19)     33                    SHA domain 2
84-96       LGLQVGNGSVAAQ (SEQ ID NO: 20)                         13
97-126      AHVADYGWLNAEGGNPVYVGTTGQSRRMEA (SEQ ID NO: 21)        30                    SHA domain 3
127-131     VRIWV (SEQ ID NO: 22)                                 5
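The segment boundaries in Table 2 can be cross-checked against the residue numbering used elsewhere in this disclosure. The sketch below concatenates the segments, confirms that each segment length matches its stated location, and reports the cysteine positions, the residue at position 108, and the fraction of the sequence covered by the three SHA domains. The variable names are illustrative only; the sequences are copied from Table 2.

```python
# (location, sequence) pairs copied from Table 2
SEGMENTS = [
    ("1-7", "ARTVCYA"),
    ("8-36", "AHVEGIGWQGAVCDGAVAGTTGQSRRMEA"),       # SHA domain 1
    ("37-50", "AVIATSGTGGVCAN"),
    ("51-83", "AHLADIGWQGWACAADGKAVTVGTTGQSRRMEA"),  # SHA domain 2
    ("84-96", "LGLQVGNGSVAAQ"),
    ("97-126", "AHVADYGWLNAEGGNPVYVGTTGQSRRMEA"),    # SHA domain 3
    ("127-131", "VRIWV"),
]

# Each segment length must agree with its stated start and end positions
for location, seq in SEGMENTS:
    start, end = map(int, location.split("-"))
    assert end - start + 1 == len(seq), location

sha = "".join(seq for _, seq in SEGMENTS)
cys_positions = [i + 1 for i, aa in enumerate(sha) if aa == "C"]  # 1-based
domain_total = sum(len(SEGMENTS[i][1]) for i in (1, 3, 5))

print(len(sha))                                # total residues
print(cys_positions)                           # cysteine positions
print(sha[107])                                # residue at position 108
print(domain_total, round(100 * domain_total / len(sha)))  # domain residues and percent
```

The result agrees with the text: 131 residues in total, cysteines at positions 5, 20, 48, and 63 (the residues assigned to the two disulfide bonds), glutamate at position 108, and 92 domain residues (about 70%).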

Table 3 demonstrates the homology between ChW and SHA domains. SHA domains 1, 2, and 3 were compared to Clostridium acetobutylicum ATCC 824 protein Q97E41, which was found using SMART (simple modular architecture research tool). The key signature of ChW domains, tryptophan (W), is underlined.

TABLE 3

Domain          Amino acid sequence                                       Identity
ChW domain      AHVQNIGWQDWVSNGAEAGTDGKGLRVEALRIKLENMP (SEQ ID NO: 23)
SHA domain 1    AHVEGIGWQGAVCDGAVAGTTGQSRRMEA (SEQ ID NO: 17)             17/29 = 59%
SHA domain 2    AHLADIGWQGWACAADGKAVTVGTTGQSRRMEA (SEQ ID NO: 19)         13/33 = 39%
SHA domain 3    AHVADYGWLNAEGGNPVYVGTTGQSRRMEA (SEQ ID NO: 21)            11/30 = 37%

Example 6: Characterization of the Carbohydrate-Binding Properties of SHA

In contrast to archived SHA, the first Trx-SHA fusion protein generated as described above was poorly soluble and not suitable for functional analyses. Therefore, a novel construct that encoded an E. coli codon-optimized, His-tagged, Trx-SMT3 (SUMO family protein)-SHA(A108E) fusion protein was expressed. This protein was refolded on the nickel-NTA column in the presence of D-galactose, and soluble recombinant SHA (rSHA) was cleaved off from the Trx-SMT3 portion using His-tagged Ulp1 (8). The purified rSHA is shown in FIG. 6C, together with archived SHA. To determine the carbohydrate-binding specificity of SHA and rSHA, the commercially available Glycan Array 100 slides were used, on which 100 synthetic glycans of the most frequently identified structures in the literature are mounted in each of four sub-arrays. The glycan binding was measured using two distinct concentrations of biotin-labeled SHAs (1×: 20 μg/ml and 0.1×: 2 μg/ml) in the absence and presence of 0.2 M L-rhamnose as a competitive inhibitor. SHA and rSHA showed identical carbohydrate-binding specificities (FIGS. 6A, 6B, and 7). Strong signals were observed for SHA/rSHA binding to: β-Gal-; α-Rha-; Gal-α-1,3-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-Gal-β-1,4-Glc-β-; Gal-α-1,4-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β- [Blood B antigen trisaccharide]; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β- [Blood B antigen tetrasaccharide]; and Gal-α-1,3-Gal-β-. Weaker, but still significant, binding to Gal-α-1,4-Gal-β-1,4-Glc-β-; GalNAc-β-1,3-Gal-β-1,4-Glc-β-; and Gal-α-1,4-Gal-β-1,4-GlcNAc-β- was also observed. SHA/rSHA did not bind to the remaining 89 chip-immobilized glycans (Table 4).

TABLE 4

Array Position   Glycan
1     β-Glc-Sp
3     α-Man-Sp
4     α-Fuc-Sp
6     β-GlcNAc-Sp
7     β-GalNAc-Sp
8     Tobramycin
9     Gal-β-1,3-GlcNAc-β-Sp
11    Neu5Ac-α-2,3-Gal-β-1,3-GlcNAc-β-Sp
12    Neu5Ac-α-2,6-Gal-β-1,3-GlcNAc-β-Sp
13    Neu5Gc-α-2,3-Gal-β-1,3-GlcNAc-β-Sp
14    Neu5Gc-α-2,6-Gal-β-1,3-GlcNAc-β-Sp
15    Gal-β-1,3-(Fuc-α-1,4)-GlcNAc-β-[Lewis A]-Sp
16    Gal-β-1,4-Glc-β-Sp
19    GlcNAc-β-1,3-Gal-β-1,4-Glc-β-Sp
21    Neu5Ac-α-2,3-Gal-β-1,4-Glc-β-Sp
22    Neu5Ac-α-2,6-Gal-β-1,4-Glc-β-Sp
23    Neu5Gc-α-2,3-Gal-β-1,4-Glc-β-Sp
24    Neu5Ac-α-2,6-Gal-β-1,4-Glc-β-Sp
25    Gal-β-1,4-(Fuc-α-1,3)-Glc-β-Sp
26    GalNAc-β-1,3-Gal-α-1,4-Gal-β-1,4-Glc-β-Sp
27    GlcNAc-β-1,6-GlcNAc-β-Sp
28    4-P-GlcNAc-β-1,4-Man-β-Sp
29    Glc-α-1,2-Gal-α-1,3-Glc-α-Sp
30    Gal-β-1,3-GalNAc-α-Sp
31    Gal-β-1,4-GlcNAc-β-Sp
32    Gal-β-1,4-(Fuc-α-1,3)-GlcNAc-β-[Lewis X]-Sp
33    Neu5Ac-α-2,3-Gal-β-1,4-(Fuc-α-1,3)-GlcNAc-β-[Sialyl Lewis X]-Sp
34    Neu5Ac-α-2,3-Gal-β-1,3-(Fuc-α-1,4)-GlcNAc-β-[Sialyl Lewis A]-Sp
35    Neu5Gc-α-2,3-Gal-β-1,3-(Fuc-α-1,4)-GlcNAc-β-[Sialyl Lewis A]-Sp
37    Gal-β-1,4-GlcNAc-β-1,3-Gal-β-1,4-Glc-β-[LNnT]-Sp
38    GlcA-β-1,4-GlcNAc-α-1,4-GlcA-β-Sp
39    GlcNAc-β-1,6-(Gal-β-1,3)-GalNAc-α-O-Ser-Sp4
40    Neu5Ac-α-2,3-Gal-β-1,4-(6S)GlcNAc-β-Sp
41    GalNAc-β-1,4-GlcNAc-β-Sp2
42    Neu5Ac-α-2,8-Neu5Ac-α-2,3-Gal-β-1,4-Glc-β-Sp
43    Neu5Gc-α-2,8-Neu5Ac-α-2,3-Gal-β-1,4-Glc-β-Sp
44    GalNAc-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β-[Blood A antigen tetrose]-Sp1
45    GlcNAc-β-1,2-Man-α-Sp
46    Neu5Ac-α-2,3-Gal-β-Sp1
47    Gal-β-1,3-GalNAc-β-1,3-Gal-β-Sp1
48    Glc-α-1,2-Gal-α-Sp
49    Gal-β-1,4-(Fuc-α-1,3)-GlcNAc-β-1,3-Gal-β-Sp1
50    Neu5Ac-α-2,3-Gal-β-1,4-(Fuc-α-1,3)-Glc-β-[3-Sialyl-3-fucosyllactose/F-SL]-Sp1
51    GlcNAc-β-1,4-GlcNAc-β-Sp1
52    β-D-GlcA-Sp
53    Gal-β-1,4-(6S)GlcNAc-β-Sp
54    GlcNAc-α-1,3-(Glc-α-1,2-Glc-α-1,2)-Gal-α-1,3-Glc-α-Sp
55    Gal-β-1,3-GalNAc-β-1,4-(Neu5Gc-α-2,3)-Gal-β-1,4-Glc-β-Sp1
56    Sisomicin Sulfate
57    GalNAc-α-1,3-(Fuc-α-1,2)-Gal-β-[Blood A antigen trisaccharide]-Sp1
58    Fuc-α-1,2-Gal-β-1,4-GlcNAc-β-[Blood H antigen trisaccharide]-Sp1
60    Fuc-α-1,2-Gal-β-1,3-GlcNAc-β-1,3-Gal-β-1,4-Glc-β-[LNFP I]-Sp1
61    Fuc-α-1,2-Gal-β-1,4-Glc-β-[Blood H antigen trisaccharide]-Sp1
63    (Fuc-α-1,2)-Gal-β-1,4-(Fuc-α-1,3)-GlcNAc-β-[Lewis Y]-Sp1
64    (Fuc-α-1,2)-Gal-β-1,3-(Fuc-α-1,4)-GlcNAc-β-[Lewis B]-Sp1
65    Gal-β-1,3-(Fuc-α-1,4)-GlcNAc-β-1,3-Gal-β-1,4-(Fuc-α-1,4)-Glc-β-[Lewis A]-Sp1
66    Gal-β-1,3-GalNAc-β-Sp1
67    Gal-β-1,3-(Neu5Ac-α-2,6)-GalNAc-β-Sp
68    Neu5Ac-α-2,6-Gal-β-1,3-GalNAc-β-Sp
69    Neu5Ac-α-2,6-Gal-β-1,3-(Neu5Ac-α-2,6)-GalNAc-β-Sp
70    Neu5Ac-α-2,3-Gal-β-1,3-(Neu5Ac-α-2,6)-GalNAc-β-Sp
71    Neu5Ac-α-2,6-(Neu5Ac-α-2,3)-Gal-β-1,3-GalNAc-β-Sp
72    GalNAc-β-1,4-(Neu5Ac-α-2,3)-Gal-β-1,4-Glc-β-[GM2]-Sp
73    GalNAc-β-1,4-(Neu5Ac-α-2,8-Neu5Ac-α-2,3)-Gal-β-1,4-Glc-β-[GD2]-Sp
75    β-D-Rha-Sp
76    Glc-α-1,4-Glc-β-Sp1
77    Glc-α-1,6-Glc-α-1,4-Glc-β-Sp1
78    Maltotriose-β-Sp1
79    Glc-α-1,6-Glc-α-1,6-Glc-β-Sp1
80    Maltotetraose-β-Sp1
81    GlcNAc-α-1,4-GlcA-β-1,4-GlcNAc-α-1,4-GlcA-β-Sp
82    Maltohexaose-β-Sp1
83    Maltoheptaose-β-Sp1
84    Acarbose-β-Sp1
85    D-pentamannuronic acid-β-Sp1
86    L-pentaguluronic acid-β-Sp1
87    D-cellose-β-Sp1
89    β-1,4-Xylotetrose-Sp1
90    Chitin-trisaccharide-Sp1
91    KDN-α-2,8-Neu5Ac-α-2,3-Gal-β-1,4-Glc-β-Sp
92    Neu5Ac-α-2,8-Neu5Gc-α-2,3-Gal-β-1,4-Glc-β-Sp
93    Neu5Ac-α-2,8-Neu5Ac-α-2,8-Neu5Ac-α-2,3-Gal-β-1,4-Glc-β-Sp3
94    Neu5Ac-α-2,8-Neu5Ac-α-2,6-Gal-β-1,4-Glc-Sp5
95    Gal-β-1,3-GalNAc-β-1,4-(Neu5Ac-α-2,3)-Gal-β-1,4-Glc-β-Sp1
96    Gentamicin Sulfate
97    Kanamycin sulfate
98    Geneticin Disulfate Salt (G418)
99    Neomycin trisulfate
100   SGP

Linkers:
Sp: OCH₂CH₂CH₂NH₂
Sp1: NH(CH₃)OCH₂CH₂NH₂
Sp2: OCH₂CH₂NH₂
Sp3: O(CH₂)₃NHCOCH₂(OCH₂CH₂)₅CH₂CH₂NH₂
Sp4: OCH₂CH(COOH)NH₂
Sp5: NH₂(-o-phenyl)-CONH-CH₂CH₂NH₂

In the presence of L-rhamnose, SHA/rSHA binding to all glycans, except α-Rha, was competitively inhibited (FIGS. 6A and 6B). These results are consistent with earlier hemagglutination inhibition or equilibrium dialysis observations in which SHA bound to L-rhamnose with a higher affinity than to D-galactose (2-4).

Example 7: Sequence Comparison of SHA and Putative Homologues

SHA homologues were identified not only in the Streptomyces genome, but also in the genomes of other microorganisms. Eleven putative SHA homologues with more than 50% homology to the SHA sequence were identified as N-terminally truncated hypothetical proteins in the genomes of S. lavendulae, S. sp. Mg1, S. sp. Wm4235, S. xanthophaeus, S. sp. Wm6378, S. clavuligerus, S. scabiei, Streptacidiphilus melanogenes, Lentzea sp. DHS C013, Actinobacteria bacterium, and Nocardia sp. NRRL S-836 (FIG. 8A). The N-terminal sequence of the putative SHA homologues varied among homologues, and a corresponding sequence was absent in SHA. In contrast, the C-terminal domain was conserved between SHA and its homologues. Compared to the 131 amino acids of the SHA sequence, the SHA homologues contained 15-133 additional amino acids at the N-terminal end, for a total of 172-265 amino acids.

To compare protein and DNA sequences of SHA and its homologues, a phylogenetic tree was generated (FIG. 8B). Protein sequence homology ranged from 51% to 99%. In the absence of SHA genetic information, S. lavendulae DNA (438 bases) was used as the reference query for SHA homologues. DNA sequence homology ranged from 67% to 82%.

The primary sequence of SHA is principally made up of three homologous “SHA domains,” each consisting of 29 to 33 amino acids (Table 2). Sequence identity between the three SHA domains ranged from 60% to 70% (FIG. 8C). The three SHA domains contained an identical stretch of eleven consecutive amino acids, GTTGQSRRMEA (SEQ ID NO: 24), at the C-terminus. Together, they comprised 92 amino acids, 70% of the total 131 amino acids in SHA. Furthermore, the SHA domains showed homology to tryptophan-rich ChW domains. ChW domains are almost exclusively found in the Clostridium acetobutylicum species (9,10). Protein Q97E41 of Clostridium acetobutylicum ATCC 824 was identified as the closest clostridial homologue to SHA, using SMART (simple modular architecture research tool). It had 59%, 39%, and 37% identity for SHA domains 1, 2, and 3, respectively (Table 3).
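The domain arithmetic quoted above can be cross-checked with a short script. The following is only a sketch: the three sequences and the identical-residue counts are copied from Table 3, while the function and variable names are hypothetical and introduced here for illustration. It does not perform any alignment; it only recomputes lengths, the shared C-terminal stretch, and the rounded percentages.

```python
# Domain sequences as listed in Table 3 (SEQ ID NOs: 17, 19, and 21).
SHA_DOMAINS = {
    1: "AHVEGIGWQGAVCDGAVAGTTGQSRRMEA",
    2: "AHLADIGWQGWACAADGKAVTVGTTGQSRRMEA",
    3: "AHVADYGWLNAEGGNPVYVGTTGQSRRMEA",
}
SHARED_C_TERMINUS = "GTTGQSRRMEA"  # SEQ ID NO: 24
SHA_LENGTH = 131  # total residues in SHA, as stated in the text

# Residues identical to the ChW domain, taken from the Table 3 fractions.
CHW_IDENTICAL = {1: 17, 2: 13, 3: 11}


def summarize_domains():
    """Recompute the lengths, coverage, and identity values quoted in the text."""
    lengths = {k: len(seq) for k, seq in SHA_DOMAINS.items()}
    shares_tail = {k: seq.endswith(SHARED_C_TERMINUS) for k, seq in SHA_DOMAINS.items()}
    total = sum(lengths.values())
    coverage_pct = round(100 * total / SHA_LENGTH)
    identity_pct = {k: round(100 * CHW_IDENTICAL[k] / lengths[k]) for k in lengths}
    return lengths, shares_tail, total, coverage_pct, identity_pct


if __name__ == "__main__":
    lengths, shares_tail, total, coverage_pct, identity_pct = summarize_domains()
    print("domain lengths:", lengths)
    print("all end in shared 11-mer:", all(shares_tail.values()))
    print("combined residues:", total, "coverage %:", coverage_pct)
    print("identity to ChW %:", identity_pct)
```

The denominators in Table 3 are simply the domain lengths, so the percentages follow directly from the listed counts.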

Example 8: NMR Identification of Tryptophan Residues

NMR titration showed that the addition of L-rhamnose caused chemical shift changes in the SHA NMR signals in the tryptophan indole NH and methyl group regions (FIG. 9). This indicates that the ChW tryptophan residues are most likely directly involved in carbohydrate binding.

Example 9: Demonstration of rSHA Binding to Microbial Surfaces

Due to the loss of the original SHA-producing Streptomyces strain 27S5, the biological role of SHA is difficult to characterize. SHA binding to microbial cell surfaces was demonstrated in this example. A green fluorescent protein (GFP)-SHA fusion protein (GFP-SHA) was constructed and used to stain various bacteria and fungi, which were then examined by fluorescence microscopy. A representative example is shown in FIG. 10, which demonstrates the binding of GFP-SHA to Lactobacillus casei Shirota cells. L. casei Shirota is rich in L-rhamnose-containing cell wall glycans (11). The binding of SHA to microbial glycans may imply a role for SHA in complex microbial communication.

Example 10: ⁶⁸Ga PET Imaging of CD-1 Mice

PET imaging was performed in mice using ⁶⁸Ga-labeled GFP-SHA fusion protein. ⁶⁸Ga was obtained from a ⁶⁸Ge/⁶⁸Ga generator system and chelated with 1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetic acid (DOTA)-labeled GFP-SHA. 5.1 MBq of the resulting ⁶⁸Ga-GFP-SHA was injected through the tail vein of female CD-1 mice, and PET imaging was conducted at the City of Hope Small Animal Imaging Core facility. ⁶⁸Ga-DOTA-scFv, which has no specificity for rhamnose-containing microorganisms, was used as a control. PET scanning was conducted 1.5 hours after intravenous injection of the PET imaging agent. As shown in FIG. 11, ⁶⁸Ga-labeled GFP-SHA specifically labels parts of the small intestine and the cecum. These are the areas that contain microorganisms.

While the control mouse shows radioactivity uptake only in the kidneys and the bladder, which is typical for small proteins, the ⁶⁸Ga-DOTA-GFP-SHA-injected mouse reveals additional strong signals from the cecum and the small intestine. These organs are known to naturally contain a rhamnose-rich microbial flora.
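Because ⁶⁸Ga is short-lived, the physical decay between injection and scanning is worth keeping in mind when reading the images. The sketch below estimates how much of the injected 5.1 MBq remains after the 1.5-hour uptake period. The half-life of about 67.7 minutes is a literature value assumed here, not a figure from this disclosure; the function name is hypothetical, and the calculation ignores biological clearance entirely.

```python
# Assumed physical half-life of gallium-68 in minutes (literature value,
# not stated in the text).
GA68_HALF_LIFE_MIN = 67.7


def remaining_activity_mbq(injected_mbq, elapsed_min, half_life_min=GA68_HALF_LIFE_MIN):
    """Activity left after simple exponential decay, A = A0 * (1/2)^(t / T_half)."""
    if injected_mbq < 0 or elapsed_min < 0 or half_life_min <= 0:
        raise ValueError("activity and time must be non-negative; half-life must be positive")
    return injected_mbq * 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    # 5.1 MBq injected, scan started 1.5 hours (90 minutes) later, as in Example 10.
    remaining = remaining_activity_mbq(5.1, 90.0)
    print(f"Activity remaining at scan time: {remaining:.2f} MBq")
```

Under these assumptions roughly 40% of the injected activity, about 2 MBq, is still present at the time of the scan.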

Example 11: Staining of LS180 Cells with GFP-SHA

LS180 human colon adenocarcinoma cells were grown in Dulbecco's Modified Eagle's Medium (DMEM) with 4.5 g/L glucose, L-glutamine, and sodium pyruvate (Corning) and seeded at 60,000 cells/well in a 24-well dish containing glass coverslips. Cells were allowed to adhere overnight, after which the media was removed and the coverslips were washed 3× with phosphate-buffered saline (PBS). Paraformaldehyde (4%) was then added and incubated for 15 min at 25° C., followed by washing 3× with PBS and blocking with 5% bovine serum albumin (BSA) in PBS for one hour. Cells were then incubated overnight at 4° C. with 10 μg/mL GFP-SHA or a GFP-only control diluted in 5% BSA in PBS. Following staining, cells were washed 3× with PBS and then mounted on glass slides with 10 μL Fluoroshield mounting medium with 4′,6-diamidino-2-phenylindole (DAPI, Abcam). Cells were then visualized using a Zeiss LSM 880 with Airyscan confocal microscope, employing excitation at 488 nm for green fluorescence and 358 nm for DAPI staining. As shown in FIG. 11, GFP-SHA binds to the surface of LS180 cells, while GFP alone does not.

Example 12: SHA Imaging of Cancer Tissues Using FITC-Labeled SHA

SHA was labeled with FITC using amine coupling chemistry. Briefly, 100 μL of SHA at 2 mg/mL in 0.1 M sodium carbonate, pH 9.0, was incubated for 1 hour with 50 μg/mL of FITC in DMSO on a rotary shaker at 22° C. (protected from light). After incubation, the reaction was quenched by adding 20 μL of 1 M ethanolamine, followed by a 30-minute incubation at 22° C. The labeled protein was purified using a 5 kDa molecular weight cut-off filter. The degree of labeling, 1.7 (fluorophore/protein), was calculated using a molar extinction coefficient of 68,000 cm⁻¹M⁻¹ for the dye at 494 nm and pH 8.0.
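The text reports the degree of labeling but not the absorbance readings behind it, so the following is only a sketch of how such a ratio is commonly computed from two absorbance measurements. Only the dye extinction coefficient (68,000 cm⁻¹M⁻¹ at 494 nm) comes from the text. The 280 nm correction factor of 0.30 for FITC, the protein extinction coefficient, the 1 cm path length, and both absorbance readings are assumptions or made-up inputs; the readings were chosen purely so that the arithmetic lands on a ratio of 1.7, and they are not measured data.

```python
FITC_EPSILON_494 = 68_000.0  # cm^-1 M^-1, from the text (pH 8.0, 494 nm)
FITC_CF_280 = 0.30           # assumed fraction of A494 that FITC contributes at 280 nm


def degree_of_labeling(a280, a494, protein_epsilon_280,
                       dye_epsilon=FITC_EPSILON_494, cf280=FITC_CF_280):
    """Return moles of dye per mole of protein, assuming a 1 cm path length.

    protein_epsilon_280 is the protein's molar extinction coefficient at 280 nm.
    """
    corrected_a280 = a280 - cf280 * a494      # remove the dye's absorbance at 280 nm
    if corrected_a280 <= 0:
        raise ValueError("corrected A280 must be positive")
    protein_molar = corrected_a280 / protein_epsilon_280
    dye_molar = a494 / dye_epsilon
    return dye_molar / protein_molar


if __name__ == "__main__":
    # Hypothetical readings and a hypothetical protein extinction coefficient.
    dol = degree_of_labeling(a280=0.3734, a494=0.578, protein_epsilon_280=40_000.0)
    print(f"Degree of labeling: {dol:.2f} fluorophore/protein")
```

A ratio well above 1 means that, on average, more than one lysine or N-terminal amine per protein molecule carries a fluorophore.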

Consecutive 7-μm-thick sections from each block of a paraffin-embedded, formalin-fixed tissue sample were cut and baked, and each section was placed on a separate slide. Sections were deparaffinized by sequential immersion in 2 xylene baths (10 minutes each), passed through 4 baths of decreasing alcohol concentration (100%, 95%, 70%, and 50%, 5 minutes each), and rinsed in 2 baths of deionized H₂O. Slides were then rehydrated in 1×PBS for 10 minutes, followed by a heat-induced antigen unmasking procedure. Briefly, slides were incubated for 8 minutes at 95° C. in 10 mM citrate buffer, pH 6.0, rinsed gently with deionized H₂O, and then rinsed with 1×PBS. Staining began with blocking of the slides overnight at 4° C. in 1×PBS containing 5% bovine serum albumin, followed by incubation with FITC-SHA at 20 μg/mL for 1 hour at 22° C. in blocking buffer. Slides were washed three times with 1×PBS. Counterstaining was performed with 4′,6-diamidino-2-phenylindole (DAPI, 3 μM) for 10 minutes, after which the slides were washed three times with 1×PBS and examined using a fluorescence microscope. Hematoxylin and eosin (H&E) staining was performed using a standard procedure (www.nationaldiaqnostics.com). Microscopy was performed with a Zeiss Axio Observer II Inverted Fluorescence Microscope (Jena, Germany) and ZEN2 (Blue Edition) software.

The FITC-SHA-stained tissues of the infiltrating ductal carcinoma case (an infiltrating, malignant, and abnormal proliferation of neoplastic cells in the breast tissues that had advanced to a tumor in the pancreas) were examined. As shown in FIG. 13, SHA clearly labeled cells in tumor tissues. Pancreatic ductal carcinoma gave very strong staining, as did other tumors (data not shown). The precise target labeled by SHA in the tumor sections is unknown at this point, but, without wishing to be bound by any theory, the labeling might be due to abnormal mucins secreted by tumor cells.

A significant difference in fluorescent signal was observed between the two groups, the pancreatic cancer tissue and the normal tissue stained with FITC-SHA, demonstrating that FITC-SHA has a higher affinity for cancerous cells than for normal cells.

As stated above, the foregoing are merely intended to illustrate the various embodiments of the present invention. As such, the specific modifications discussed above are not to be construed as limitations on the scope of the invention. It will be apparent to one skilled in the art that various equivalents, changes, and modifications may be made without departing from the scope of the invention, and it is understood that such equivalent embodiments are to be included herein. All references cited herein are incorporated by reference as if fully set forth herein.

REFERENCES

1. Fujita, Y., Oishi, K., and Aida, K. (1972) Hemagglutination by culture broth of Actinomycetes and Aspergillus. J. Gen. Appl. Microbiol. 18, 73-75
2. Fujita, Y., Oishi, K., and Aida, K. (1973) Sugar specificity of anti-B hemagglutinin produced by Streptomyces sp. Biochemical and biophysical research communications 53, 495-501
3. Fujita, Y., Oishi, K., Suzuki, K., and Imahori, K. (1975) Purification and properties of an anti-B hemagglutinin produced by Streptomyces sp. Biochemistry 14, 4465-4470
4. Fujita-Yamaguchi, Y., Oishi, K., Suzuki, K., and Imahori, K. (1982) Studies on carbohydrate binding to a lectin purified from Streptomyces sp. Biochimica et biophysica acta 701, 86-92
5. Harrison, J., and Studholme, D. J. (2014) Recently published Streptomyces genome sequences. Microbial biotechnology 7, 373-380
6. Hoefler, B. C., Konganti, K., and Straight, P. D. (2013) De Novo Assembly of the Streptomyces sp. Strain Mg1 Genome Using PacBio Single-Molecule Sequencing. Genome announcements 1
7. Fujita, Y. (1976) Studies on hemagglutinins produced by microorganisms. The University of Tokyo
8. Guerrero, F., Ciragan, A., and Iwai, H. (2015) Tandem SUMO fusion vectors for improving soluble protein expression and purification. Protein expression and purification 116, 42-49
9. Nolling, J., Breton, G., Omelchenko, M. V., Makarova, K. S., Zeng, Q., Gibson, R., Lee, H. M., Dubois, J., Qiu, D., Hitti, J., Wolf, Y. I., Tatusov, R. L., Sabathe, F., Doucette-Stamm, L., Soucaille, P., Daly, M. J., Bennett, G. N., Koonin, E. V., and Smith, D. R. (2001) Genome sequence and comparative analysis of the solvent-producing bacterium Clostridium acetobutylicum. Journal of bacteriology 183, 4823-4838
10. Sullivan, L., Paredes, C. J., Papoutsakis, E. T., and Bennett, G. N. (2007) Analysis of the clostridial hydrophobic with a conserved tryptophan family (ChW) of proteins in Clostridium acetobutylicum with emphasis on ChW14 and ChW16/17. Enzyme and Microbial Technology 42, 29-43
11. Yasuda, E., Tateno, H., Hirabayashi, J., Iino, T., and Sako, T. (2011) Lectin microarray reveals binding profiles of Lactobacillus casei strains in a comprehensive analysis of bacterial cell wall polysaccharides. Applied and environmental microbiology 77, 4539-4546
12. Zhang, Q., Ames, J. M., Smith, R. D., Baynes, J. W., and Metz, T. O. (2009) A perspective on the Maillard reaction and the analysis of protein glycation by mass spectrometry: probing the pathogenesis of chronic disease. Journal of proteome research 8, 754-769
13. Negre-Salvayre, A., Salvayre, R., Auge, N., Pamplona, R., and Portero-Otin, M. (2009) Hyperglycemia and glycation in diabetic complications. Antioxidants & redox signaling 11, 3071-3109
14. Cheng, H. N., and Neiss, T. G. (2012) Solution NMR spectroscopy of food polysaccharides. Polymer Reviews 52, 81-114
15. Fujimoto, Z., Jackson, A., Michikawa, M., Maehara, T., Momma, M., Henrissat, B., Gilbert, H. J., and Kaneko, S. (2013) The Structure of a Streptomyces avermitilis α-l-Rhamnosidase Reveals a Novel Carbohydrate-binding Module CBM67 within the Six-domain Arrangement. The Journal of Biological Chemistry 288, 12376-12385
16. Lamoth, F. (2016) Galactomannan and 1,3-β-D-Glucan Testing for the Diagnosis of Invasive Aspergillosis. J. Fungi 2, 22
17. Tateno, H., Saneyoshi, A., Ogawa, T., Muramoto, K., Kamiya, H., and Saneyoshi, M. (1998) Isolation and characterization of rhamnose-binding lectins from eggs of steelhead trout (Oncorhynchus mykiss) homologous to low density lipoprotein receptor superfamily. The Journal of biological chemistry 273, 19190-19197
18. Hosono, M., Ishikawa, K., Mineki, R., Murayama, K., Numata, C., Ogawa, Y., Takayanagi, Y., and Nitta, K. (1999) Tandem repeat structure of rhamnose-binding lectin from catfish (Silurus asotus) eggs. Biochimica et biophysica acta 1472, 668-675
19. Tateno, H., Ogawa, T., Muramoto, K., Kamiya, H., and Saneyoshi, M. (2002) Distribution and molecular evolution of rhamnose-binding lectins in Salmonidae: isolation and characterization of two lectins from white-spotted Charr (Salvelinus leucomaenis) eggs. Bioscience, biotechnology, and biochemistry 66, 1356-1365
20. Tateno, H., Ogawa, T., Muramoto, K., Kamiya, H., and Saneyoshi, M. (2002) Rhamnose-binding lectins from steelhead trout (Oncorhynchus mykiss) eggs recognize bacterial lipopolysaccharides and lipoteichoic acid. Bioscience, biotechnology, and biochemistry 66, 604-612
21. Terada, T., Watanabe, Y., Tateno, H., Naganuma, T., Ogawa, T., Muramoto, K., and Kamiya, H. (2007) Structural characterization of a rhamnose-binding glycoprotein (lectin) from Spanish mackerel (Scomberomorous niphonius) eggs. Biochimica et biophysica acta 1770, 617-629
22. Tateno, H. (2010) SUEL-related lectins, a lectin family widely distributed throughout organisms. Bioscience, biotechnology, and biochemistry 74, 1141-1144
23. Ogawa, T., Watanabe, M., Naganuma, T., and Muramoto, K. (2011) Diversified carbohydrate-binding lectins from marine resources. Journal of amino acids 2011, 838914
24. Ozeki, Y., Yokota, Y., Kato, K. H., Titani, K., and Matsui, T. (1995) Developmental expression of D-galactoside-binding lectin in sea urchin (Anthocidaris crassispina) eggs. Experimental cell research 216, 318-324
25. Watanabe, Y., Tateno, H., Nakamura-Tsuruta, S., Kominami, J., Hirabayashi, J., Nakamura, O., Watanabe, T., Kamiya, H., Naganuma, T., Ogawa, T., Naude, R. J., and Muramoto, K. (2009) The function of rhamnose-binding lectin in innate immunity by restricted binding to Gb3. Developmental and comparative immunology 33, 187-197
26. Sharon, N. (1987) Bacterial lectins, cell-cell recognition and infectious disease. FEBS letters 217, 145-157
27. Ng, S. K., Huang, Y. T., Lee, Y. C., Low, E. L., Chiu, C. H., Chen, S. L., Mao, L. C., and Chang, M. D. (2014) A recombinant horseshoe crab plasma lectin recognizes specific pathogen-associated molecular patterns of bacteria through rhamnose. PloS one 9, e115296
28. Carpentier, G. (2010) Contribution: Protein Array Analyzer for ImageJ. ImageJ News 10, presented at the poster session at the ImageJ User and Developer Conference, Oct. 27-29, 2010
29. Smith, S. V., Jones, M., and Holmes, V. (2011) Production and Selection of Metal PET Radioisotopes for Molecular Imaging. In Radioisotopes - Applications in Bio-Medical Science, Prof. Nirmal Singh (Ed.), ISBN: 978-953-307-748-2

1. A recombinant Streptomyces S27S5 hemagglutinin (SHA) protein having an amino acid sequence represented by SEQ ID NO: 25 or a homologue thereof having an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identical to that of SEQ ID NO: 25, wherein the SHA protein or a homologue thereof specifically binds to L-rhamnose or D-galactose.
2. The recombinant SHA protein or a homologue thereof of claim 1, wherein the recombinant SHA protein or a homologue thereof specifically binds to a carbohydrate that contains or terminates in a monosaccharide or an oligosaccharide selected from the group consisting of β-Gal-; α-Rha-; Gal-α-1,3-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-Gal-β-1,4-Glc-β-; Gal-α-1,4-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β-; Gal-α-1,3-Gal-β-; Gal-α-1,4-Gal-β-1,4-Glc-β-; GalNAc-β-1,3-Gal-β-1,4-Glc-β-; and Gal-α-1,4-Gal-β-1,4-GlcNAc-β-.
3. A fusion protein comprising a fluorescent protein and SHA, a homologue thereof, or a functional fragment of the SHA or a homologue thereof, wherein the SHA protein has an amino acid sequence represented by SEQ ID NO: 25, the homologue has an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identical to that of SEQ ID NO: 25, and wherein the fusion protein specifically binds to L-rhamnose or D-galactose.
4. The fusion protein of claim 3, wherein the fusion protein specifically binds to a carbohydrate that contains or terminates in a monosaccharide or an oligosaccharide selected from the group consisting of β-Gal-; α-Rha-; Gal-α-1,3-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-Gal-β-1,4-Glc-β-; Gal-α-1,4-Gal-β-1,3-GlcNAc-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-; Gal-α-1,3-(Fuc-α-1,2)-Gal-β-1,4-Glc-β-; Gal-α-1,3-Gal-β-; Gal-α-1,4-Gal-β-1,4-Glc-β-; GalNAc-β-1,3-Gal-β-1,4-Glc-β-; and Gal-α-1,4-Gal-β-1,4-GlcNAc-β-.
5. The fusion protein of claim 3, wherein the fluorescent protein includes GFP and mCherry1.
6. A method of detecting a microbial infection caused by a microorganism expressing a carbohydrate containing L-rhamnose or D-galactose on the surface in a subject, comprising: contacting a sample obtained from the subject with the fusion protein of claim 3, and detecting the fluorescence level in the sample, wherein the presence of fluorescence indicates a microbial infection.
7. The method of claim 6, wherein the sample includes a biopsy sample, a tissue sample, a bronchoalveolar lavage sample, a blood sample, and a urine sample.
8. The method of claim 6, wherein the microbial infection is caused by a bacterium or a fungus that expresses dTDP-4-dehydrorhamnose reductase gene (rmID).
9. The method of claim 6, wherein the microbial infection includes mycoses caused by Candida albicans, Aspergillus fumigatus, or Fusarium solani.
10. The method of claim 6, wherein the microbial infection is an infection by Streptococcus, Enterococcus or Lactococcus.
11. The method of claim 6, wherein the microbial infection is invasive pulmonary aspergillosis, and the GFP-SHA fusion protein detects the presence of fungal galactomannan.
12. A method of detecting a cancer or tumor in a subject, comprising: contacting a sample obtained from the subject with the fusion protein of claim 3, and detecting the fluorescence level in the sample, wherein the presence of fluorescence indicates the presence of the cancer or tumor, wherein the cancer or tumor cell expresses an antigen comprising a carbohydrate capable of specifically binding to SHA, a homologue thereof, or a fragment of the SHA or a homologue thereof.
13. The method of claim 12, wherein the sample includes a biopsy sample, a tissue sample, a bronchoalveolar lavage sample, a blood sample, and a urine sample.
14. The method of claim 12, wherein the cancer is colon cancer, pancreatic ductal carcinoma, or pancreatic cancer.
15. The method of claim 12, wherein the antigen is a surface antigen comprising a carbohydrate containing D-galactose.
16. A positron emission tomography (PET) probe comprising the recombinant SHA protein or a homologue thereof of claim 1 labeled with a PET isotope, or the fusion protein of claim 3 labeled with a PET isotope.
17. A method of imaging an organ or tissue having a microbial infection caused by a microorganism expressing a carbohydrate containing L-rhamnose or D-galactose on the surface in a subject, comprising: administering to the subject the PET probe of claim 16, and imaging the organ or the tissue having the microbial infection by a PET scanning of the subject.
18. The method of claim 17, wherein the PET probe is administered locally or systemically to the organ or the tissue.
19. The method of claim 17, wherein the PET probe is administered by intravenous injection.
20. A method of imaging a solid tumor in a subject, comprising: administering to the subject the PET probe of claim 16, and imaging the solid tumor by a PET scanning of the subject, wherein the tumor cell expresses a tumor-specific antigen comprising a carbohydrate capable of specifically binding to SHA, a homologue thereof, or a fragment of the SHA or a homologue thereof.
21. The method of claim 20, wherein the PET probe is administered locally to the tumor.
22. The method of claim 20, wherein the antigen is a surface antigen comprising a carbohydrate containing D-galactose.
23. A method of detecting the location of a microbial infection caused by a microorganism expressing a carbohydrate containing L-rhamnose or D-galactose on the surface in a subject, comprising: administering to the subject the PET probe of claim 16, and detecting the location of the microbial infection by a PET scanning of the subject, wherein the presence of the PET probe indicates the location of the microbial infection.
24. The method of claim 23, wherein the PET probe is administered to the subject by intravenous injection.
25. A method of detecting the location of cancer cells in a subject, comprising: administering to the subject the PET probe of claim 16, and detecting the location of the cancer cells by a PET scanning of the subject, wherein the presence of the PET probe indicates the location of the cancer cells, and wherein the cancer cells express an antigen comprising a carbohydrate capable of specifically binding to SHA, a homologue thereof, or a fragment of the SHA or a homologue thereof.
26. The method of claim 25, wherein the PET probe is administered to the subject by intravenous injection.
27. The method of claim 25, wherein the antigen is a surface antigen comprising a carbohydrate containing D-galactose.